CN110669775B - Application of differential proxy technology in enrichment of A·G base substitution cells

Publication number
CN110669775B
CN110669775B CN201910938672.XA CN201910938672A CN110669775B CN 110669775 B CN110669775 B CN 110669775B CN 201910938672 A CN201910938672 A CN 201910938672A CN 110669775 B CN110669775 B CN 110669775B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910938672.XA
Other languages
Chinese (zh)
Other versions
CN110669775A (en)
Inventor
杨进孝
赵久然
张成伟
徐雯
武莹
吕欣欣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Academy of Agriculture and Forestry Sciences
Original Assignee
Beijing Academy of Agriculture and Forestry Sciences
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Academy of Agriculture and Forestry Sciences
Priority to CN201910938672.XA
Publication of CN110669775A
Application granted
Publication of CN110669775B

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/65 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression using markers
    • C12N15/66 General methods for inserting a gene into a vector to form a recombinant vector using cleavage and ligation; Use of non-functional linkers or adaptors, e.g. linkers containing the sequence for a restriction endonuclease
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/82 Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
    • C12N15/8201 Methods for introducing genetic material into plant cells, e.g. DNA, RNA, stable or transient incorporation, tissue culture methods adapted for transformation
    • C12N15/8209 Selection, visualisation of transformants, reporter constructs, e.g. antibiotic resistance markers
    • C12N15/8216 Methods for controlling, regulating or enhancing expression of transgenes in plant cells
    • C12N15/8218 Antisense, co-suppression, viral induced gene silencing [VIGS], post-transcriptional induced gene silencing [PTGS]
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/16 Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22 Ribonucleases RNAses, DNAses

Abstract

The invention discloses an application of differential proxy technology in the enrichment of A·G base-substituted cells. The differential proxy technology vector comprises an esgRNA targeting a target gene target sequence, an sgRNA targeting the target sequence of a loss-of-function screening agent resistance gene, an A·G base substitution system, and the loss-of-function screening agent resistance gene. Guided by the sgRNA that targets the loss-of-function screening agent resistance gene, the A·G base substitution system performs an A·G base substitution on that gene's target sequence and thereby restores its function. The invention enriches A·G base-substituted cells at the cellular level and greatly improves A·G base substitution efficiency.

Description

Application of differential proxy technology in enrichment of A·G base substitution cells
Technical Field
The invention relates to the field of biotechnology, and in particular to the application of differential proxy technology in the enrichment of A·G base-substituted cells.
Background
CRISPR-Cas9 has become a powerful genome editing tool and is widely applied in many tissues and cells. A guide RNA directs the CRISPR/Cas9 protein-RNA complex to its target, where the complex cleaves the DNA to generate a double-strand break (DSB); the organism then initiates a DNA repair mechanism to repair the DSB. There are generally two repair mechanisms: non-homologous end joining (NHEJ) and homology-directed repair (HDR). NHEJ usually dominates, so repair yields random indels (insertions or deletions) far more often than precise repair. Because HDR is inefficient and requires a DNA template, its use for precise base substitution is greatly limited.
In 2017, David Liu's laboratory reported a novel adenine base editor (ABE). After seven rounds of evolution, the researchers fused a tRNA adenine deaminase derived from Escherichia coli (ecTadA) to the N-terminus of Cas9 nickase (Cas9n). The fusion directly replaces a single base A (adenine) with G (guanine) in cells without generating a DSB or initiating HDR, which greatly improves the efficiency of A-to-G base editing. The process is as follows: an sgRNA containing a genome-targeting sequence binds ecTadA-Cas9n and guides the complex to the target; ecTadA deaminates an adenine on the unpaired single-stranded DNA to inosine (I), which is read as G during DNA repair; Cas9n nicks a phosphodiester bond of the paired DNA strand, and a cytosine (C) that pairs with I is introduced when the nick is repaired. In the next round of repair a C-G pair is generated, completing the A-to-G conversion.
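The conversion described above can be traced base by base. The following is a minimal Python sketch rather than anything taken from the patent: the function names and the four-base example strand are hypothetical, strands are written position by position (no reverse-complementing) for readability, and only the single targeted A is deaminated.

```python
# Toy trace of the A -> I -> G conversion. All names and the example strand
# are illustrative assumptions, not part of the patent.

# Pairing partner of each base; inosine (I) pairs with C, so it is read as G.
PAIRS_WITH = {"A": "T", "T": "A", "G": "C", "C": "G", "I": "C"}


def deaminate(strand: str, position: int) -> str:
    """Convert the A at a 1-based position to inosine (I)."""
    if strand[position - 1] != "A":
        raise ValueError(f"base {position} is {strand[position - 1]!r}, not 'A'")
    return strand[: position - 1] + "I" + strand[position:]


def copy_strand(template: str) -> str:
    """Return the position-wise partner strand synthesized from a template."""
    return "".join(PAIRS_WITH[base] for base in template)


original = "TCAG"                     # hypothetical editable strand
deaminated = deaminate(original, 3)   # A -> I on the unpaired strand
partner = copy_strand(deaminated)     # a C is placed opposite I
resolved = copy_strand(partner)       # a G is placed opposite that C
```

In this toy trace the third base changes from A to G while the other bases are unchanged.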
To date, research on enriching A·G base-substituted cells in plants by reporter-gene-mediated cell enrichment is very limited, and there is no report of using a selection marker during transformation to enrich A·G base-substituted cells at the cellular level and thereby improve A·G base substitution efficiency.
Disclosure of Invention
The invention aims to provide an application of differential proxy technology in the enrichment of A·G base-substituted cells. The differential proxy technology enriches A·G base-substituted cells at the cellular level and thereby improves the A·G base substitution efficiency at the target site.
In order to achieve the above object, the present invention first provides a kit comprising an sgRNA or a biological material related to the sgRNA, an A·G base substitution system, and a loss-of-function screening agent resistance gene or a biological material related to the loss-of-function screening agent resistance gene;
the sgRNA consists of an esgRNA targeting a target gene target sequence and an sgRNA targeting the target sequence of the loss-of-function screening agent resistance gene;
the structure of the esgRNA targeting the target gene target sequence is: the RNA transcribed from the target gene target sequence, followed by the esgRNA backbone;
the structure of the sgRNA targeting the target sequence of the loss-of-function screening agent resistance gene is: the RNA transcribed from that target sequence, followed by the sgRNA backbone;
the A·G base substitution system comprises a Cas9 nuclease or a biological material related to the Cas9 nuclease, and an adenine deaminase or a biological material related to the adenine deaminase;
guided by the sgRNA targeting the loss-of-function screening agent resistance gene, the A·G base substitution system performs an A·G base substitution on that gene's target sequence and thereby restores the function of the loss-of-function screening agent resistance gene;
the sgRNA backbone is S1) or S2) or S3):
S1) an RNA molecule obtained by replacing T with U in positions 2418-2493 of sequence 1;
S2) an RNA molecule obtained by substituting and/or deleting and/or adding one or more nucleotides in the RNA molecule of S1) and having the same function;
s3) and S1) or S2) and has the same function;
the esgRNA backbone is T1) or T2) or T3):
T1) an RNA molecule obtained by replacing T with U in positions 617-702 of sequence 1;
T2) an RNA molecule obtained by substituting and/or deleting and/or adding one or more nucleotides in the RNA molecule of T1) and having the same function;
t3) and T1) or T2) and has the same function.
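The backbone definitions in S1) and T1) amount to slicing a coordinate range out of sequence 1 and writing it as RNA. A minimal Python sketch of that step follows; the function name and the short demo string are hypothetical, and the real sequence 1 (from the sequence listing) is not reproduced here.

```python
def backbone_rna(dna: str, start: int, end: int) -> str:
    """Return positions start..end (1-based, inclusive) of a DNA string as RNA."""
    if not 1 <= start <= end <= len(dna):
        raise ValueError("coordinates fall outside the sequence")
    # Slice the requested range, then write every T as U.
    return dna[start - 1 : end].upper().replace("T", "U")


# Coordinates stated in the text for sequence 1.
SGRNA_BACKBONE = (2418, 2493)
ESGRNA_BACKBONE = (617, 702)

# Placeholder DNA used only to show the call; it is not sequence 1.
demo_dna = "ACGTTGCA"
demo_rna = backbone_rna(demo_dna, 3, 6)

# Lengths implied by the stated coordinates.
sgrna_backbone_length = SGRNA_BACKBONE[1] - SGRNA_BACKBONE[0] + 1
esgrna_backbone_length = ESGRNA_BACKBONE[1] - ESGRNA_BACKBONE[0] + 1
```

With the full sequence 1 loaded as a string, `backbone_rna(sequence_1, *SGRNA_BACKBONE)` would return the S1) molecule under these assumptions.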
In the kit, the number of target gene target sequences may be one, two, or more; the number of target sequences of the loss-of-function screening agent resistance gene may likewise be one, two, or more. The target sequence may be 15-25 bp long, more specifically 18-22 bp, and still more specifically 20 bp.
The loss-of-function screening agent resistance gene satisfies the following conditions: its function or activity is lost, and its function can be restored after an A·G base substitution is made in its target sequence. That target sequence may lie within the loss-of-function screening agent resistance gene itself, or it may be a target sequence added inside the gene or at its 5' or 3' end. When a target sequence (referred to as a surrogate target sequence) is added so that the gene can recover its function after the A·G base substitution, the loss-of-function screening agent resistance gene sequence includes the gene itself, the surrogate target sequence, and, where necessary, one or two or more bases added to ensure that the gene is translated in the normal reading frame after the surrogate target sequence is added.
Further, the loss-of-function screening agent resistance gene may be a sequence obtained by deleting the start codon (e.g., ATG) of the screening agent resistance gene and adding a surrogate target sequence to its 5' end. The surrogate target sequence satisfies the following condition: an A·G base substitution made in it by the A·G base substitution system restores the function of the loss-of-function screening agent resistance gene. The surrogate target sequence consists of, in order, the target sequence of the loss-of-function screening agent resistance gene and a PAM sequence. To ensure that the screening agent resistance gene lacking its start codon is translated in the normal reading frame after the surrogate target sequence is added, one or two or more bases may be added between the surrogate target sequence and that gene.
In one embodiment of the invention, the surrogate target sequence is sequence 5, and the target sequence of the loss-of-function screening agent resistance gene is positions 1-20 of sequence 5. Guided by the sgRNA targeting the surrogate target sequence, the A·G base substitution system mutates the 6th base of the surrogate target sequence from A to G, forming an ATG start codon and thereby restoring the function of the screening agent resistance gene. To ensure that the screening agent resistance gene lacking its start codon is translated in the normal reading frame after the surrogate target sequence is added, a base C is added between the surrogate target sequence and that gene.
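The two conditions in this embodiment (the edit at base 6 creates ATG, and one added C keeps the downstream gene in frame) can be checked mechanically. The sketch below is illustrative only: the 20-nt target and the 3-nt PAM are hypothetical placeholders, not the patent's sequence 5, and a 3-nt PAM is itself an assumption.

```python
# Hypothetical 20-nt target followed by an assumed 3-nt PAM; NOT sequence 5.
TARGET = "GCCATAGCTCTGGCACTCGC"
PAM = "TGG"
SURROGATE = TARGET + PAM
SPACER = "C"        # the single base the text adds before the resistance gene
EDIT_POSITION = 6   # 1-based position of the A that is converted to G


def edit_a_to_g(seq: str, position: int) -> str:
    """Apply an A-to-G substitution at a 1-based position."""
    if seq[position - 1] != "A":
        raise ValueError(f"base {position} is not A")
    return seq[: position - 1] + "G" + seq[position:]


def has_start_codon(seq: str, position: int) -> bool:
    """True if the three bases ending at `position` read ATG."""
    return seq[position - 3 : position] == "ATG"


def leader_codons(seq: str, spacer: str, position: int) -> list[str]:
    """Split the stretch from the start codon through the spacer into codons."""
    leader = seq[position - 3 :] + spacer
    if len(leader) % 3 != 0:
        raise ValueError("leader length is not a multiple of 3 (frame is shifted)")
    return [leader[i : i + 3] for i in range(0, len(leader), 3)]


edited = edit_a_to_g(SURROGATE, EDIT_POSITION)
codons = leader_codons(edited, SPACER, EDIT_POSITION)
```

Under these assumptions the stretch from the new ATG to the end of the surrogate is 20 nt, so exactly one extra base brings it to 21 nt, a whole number of codons, which agrees with the single C described in the text.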
Further, the screening agent resistance gene may be one commonly used in the art, such as the Bar/PAT glufosinate N-acetyltransferase gene, the PMI 6-phosphomannose isomerase gene, or the EPSPS 5-enolpyruvylshikimate-3-phosphate synthase gene. In one embodiment of the invention, the screening agent resistance gene is a hygromycin resistance gene.
In the above kit, the Cas9 nuclease includes Cas9 nucleases or variants thereof from different sources, catalytically inactive Cas9 (dead Cas9, dCas9) or variants thereof, and Cas9 nickase (Cas9n) or variants thereof. The Cas9 nucleases or variants from different sources include bacterially derived Cas9 (such as SaCas9 and SaCas9-KKH), Cas9 PAM variants (such as xCas9, NG Cas9, Cas9-VQR, and Cas9-VRER), and high-fidelity Cas9 variants (such as HypaCas9, eSpCas9(1.1), and Cas9-HF1). In one specific embodiment of the invention, the Cas9 nuclease is Cas9n, specifically the SpCas9n protein. In another embodiment of the invention, the Cas9 nuclease is Cas9n, specifically the HypaCas9n protein.
The adenine deaminase can be adenine deaminase of different sources, such as an ecTadA protein derived from Escherichia coli, or adenine deaminase derived from a plant endogenous source (such as a rice endogenous OsTadA, an Arabidopsis thaliana derived AtTadA, and the like). In a particular embodiment of the invention, the adenine deaminase is an ecTadA protein derived from escherichia coli.
Further, the SpCas9n protein is A1) or A2) or A3):
A1) a protein whose amino acid sequence is shown in sequence 3;
A2) a protein obtained by substituting and/or deleting and/or adding one or more amino acid residues in the amino acid sequence shown in sequence 3 of the sequence listing and having the same function;
A3) a fusion protein obtained by attaching a tag to the N-terminus and/or C-terminus of A1) or A2);
the biological material related to the SpCas9n is any one of B1) to B5):
B1) a nucleic acid molecule encoding the SpCas9n;
B2) an expression cassette comprising the nucleic acid molecule of B1);
B3) a recombinant vector containing the nucleic acid molecule of B1) or a recombinant vector containing the expression cassette of B2);
B4) a recombinant microorganism containing B1) the nucleic acid molecule, or a recombinant microorganism containing B2) the expression cassette, or a recombinant microorganism containing B3) the recombinant vector;
B5) a transgenic cell line comprising B1) the nucleic acid molecule or a transgenic cell line comprising B2) the expression cassette;
the ecTadA protein is E1) or E2) or E3):
E1) a protein whose amino acid sequence is shown in sequence 2;
E2) a protein obtained by substituting and/or deleting and/or adding one or more amino acid residues in the amino acid sequence shown in sequence 2 of the sequence listing and having the same function;
E3) a fusion protein obtained by attaching a tag to the N-terminus and/or C-terminus of E1) or E2);
the biological material related to said ecTadA protein is any one of F1) to F5):
F1) a nucleic acid molecule encoding said ecTadA protein;
F2) an expression cassette comprising the nucleic acid molecule of F1);
F3) a recombinant vector comprising the nucleic acid molecule of F1) or a recombinant vector comprising the expression cassette of F2);
F4) a recombinant microorganism containing F1) said nucleic acid molecule, or a recombinant microorganism containing F2) said expression cassette, or a recombinant microorganism containing F3) said recombinant vector;
F5) a transgenic cell line comprising the nucleic acid molecule of F1) or a transgenic cell line comprising the expression cassette of F2);
the biological material related to the loss-of-function screener resistance gene is any one of K1) to K4):
K1) an expression cassette containing the loss-of-function selection agent resistance gene;
K2) a recombinant vector containing the selection agent resistance gene having the loss of function, or a recombinant vector containing K1) the expression cassette;
K3) a recombinant microorganism containing the loss-of-function screener resistance gene, or a recombinant microorganism containing K1) the expression cassette, or a recombinant microorganism containing K2) the recombinant vector;
K4) a transgenic cell line containing the loss-of-function screener resistance gene, or a transgenic cell line containing the expression cassette of K1).
In order to facilitate the purification of the proteins A1) and E1), the amino terminal or the carboxyl terminal of the protein consisting of the amino acid sequence shown in the sequence 2 or the sequence 3 in the sequence table is linked with the tags shown in the following table.
Table: Tag sequences
Tag            Residues              Sequence
Poly-Arg       5-6 (typically 5)     RRRRR
Poly-His       2-10 (typically 6)    HHHHHH
FLAG           8                     DYKDDDDK
Strep-tag II   8                     WSHPQFEK
c-myc          10                    EQKLISEEDL
The protein of A2) or E2) has 75% or more identity to the amino acid sequence shown in sequence 2 or sequence 3 and has the same function. Identity of 75% or more means 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity.
The protein in A2) and E2) can be artificially synthesized, or can be obtained by synthesizing the coding gene and then performing biological expression.
The coding genes of the proteins of A2) and E2) can be obtained by deleting the codons of one or more amino acid residues from the DNA sequences at positions 4205-4705 of sequence 1 (encoding the protein of sequence 2) and positions 5396-9496 of sequence 1 (encoding the protein of sequence 3), and/or by introducing missense mutations of one or more base pairs, and/or by attaching the coding sequences of the tags shown in the table above to the 5' end and/or 3' end.
Further, the nucleic acid molecule of B1) is b1) or b2) or b3):
b1) a cDNA molecule or DNA molecule shown in positions 5396-9496 of sequence 1 in the sequence listing;
b2) a cDNA molecule or DNA molecule having 75% or more identity to the nucleotide sequence defined in b1) and encoding the SpCas9n;
b3) a cDNA molecule or DNA molecule that hybridizes under stringent conditions to the nucleotide sequence defined in b1) or b2) and encodes the SpCas9n;
the nucleic acid molecule of F1) is f1) or f2) or f3):
f1) a cDNA molecule or DNA molecule shown in positions 4205-4705 of sequence 1 in the sequence listing;
f2) a cDNA molecule or DNA molecule having 75% or more identity to the nucleotide sequence defined in f1) and encoding the ecTadA;
f3) a cDNA molecule or DNA molecule that hybridizes under stringent conditions to the nucleotide sequence defined in f1) or f2) and encodes the ecTadA;
the loss-of-function screening agent resistance gene of K1) is the DNA molecule shown in positions 12278-13324 of sequence 1.
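The coordinate ranges listed above can be sanity-checked before any sequence is extracted. The Python sketch below computes the length of each stated region of sequence 1 and tests whether it is a whole number of codons; the dictionary keys and helper names are hypothetical labels, and only the coordinates come from the text.

```python
# Regions of sequence 1 as stated in the text (1-based, inclusive).
CODING_REGIONS = {
    "ecTadA": (4205, 4705),
    "SpCas9n": (5396, 9496),
    "loss-of-function resistance gene": (12278, 13324),
}


def region_length(start: int, end: int) -> int:
    """Length of an inclusive 1-based coordinate range."""
    return end - start + 1


def extract(sequence: str, start: int, end: int) -> str:
    """Slice an inclusive 1-based range out of a sequence string."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("coordinates fall outside the sequence")
    return sequence[start - 1 : end]


lengths = {name: region_length(*span) for name, span in CODING_REGIONS.items()}
whole_codons = {name: n % 3 == 0 for name, n in lengths.items()}
```

Each of the three regions comes out as a multiple of three nucleotides (501, 4101, and 1047), which is what one would expect of coding sequence; whether a given region includes a start or stop codon cannot be told from the coordinates alone.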
Wherein the nucleic acid molecule may be DNA, such as cDNA, genomic DNA or recombinant DNA; the nucleic acid molecule may also be RNA, such as mRNA or hnRNA, etc.
The nucleotide sequence encoding the SpCas9n or the ecTadA of the present invention can readily be mutated by one of ordinary skill in the art using known methods, such as directed evolution and point mutation. Artificially modified nucleotide sequences that have 75% or more identity to the nucleotide sequence encoding the SpCas9n or the ecTadA of the invention are derived from, and equivalent to, the nucleotide sequences of the invention as long as they encode the SpCas9n or the ecTadA and have the same function.
The term "identity" as used herein refers to sequence similarity to a native nucleic acid sequence. "Identity" includes nucleotide sequences that are 75% or more, or 85% or more, or 90% or more, or 95% or more identical to the nucleotide sequence encoding the protein whose amino acid sequence is shown in sequence 2 or sequence 3 of the present invention. Identity can be assessed visually or with computer software. With computer software, the identity between two or more sequences can be expressed as a percentage (%), which can be used to assess the identity between related sequences.
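As an illustration of expressing identity as a percentage, the sketch below scores two sequences that have already been aligned to equal length. It is one simple convention, assumed here for illustration: gap columns count as mismatches and the denominator is the alignment length. Alignment software may define the denominator differently, and the text does not specify which program or convention is used.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both sequences.

    Both inputs must already be aligned (equal length, '-' for gaps).
    Columns containing a gap are counted as mismatches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("empty alignment")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 75.0) -> bool:
    """True if identity is at or above the threshold (75% by default)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


one_mismatch = percent_identity("ACGTACGT", "ACGTACGA")
half_gapped = percent_identity("ACGTACGT", "ACGT----")
```

Here the first pair differs at one of eight columns and passes the 75% threshold, while the second pair, with four gap columns, does not.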
The stringent conditions are: hybridizing and washing the membrane twice, 5 min each, at 68 °C in a solution of 2× SSC, 0.1% SDS, then hybridizing and washing the membrane twice, 15 min each, at 68 °C in a solution of 0.5× SSC, 0.1% SDS; alternatively, hybridizing and washing the membrane at 65 °C in a solution of 0.1× SSPE (or 0.1× SSC), 0.1% SDS.
The above-mentioned identity of 75% or more may be 80%, 85%, 90% or 95% or more.
B2) The expression cassette containing the nucleic acid molecule encoding the SpCas9n protein (SpCas9n gene expression cassette) refers to DNA capable of expressing the SpCas9n protein in host cells, and the DNA may include not only a promoter for starting the transcription of the SpCas9n gene, but also a terminator for terminating the transcription of the SpCas9n gene. Further, the expression cassette may also include an enhancer sequence. The existing expression vector can be used for constructing a recombinant vector containing the SpCas9n gene expression cassette.
F2) The expression cassette containing a nucleic acid molecule encoding an ecTadA protein (ecTadA gene expression cassette) is a DNA capable of expressing an ecTadA protein in a host cell, and the DNA may include not only a promoter which initiates transcription of the ecTadA gene but also a terminator which terminates transcription of the ecTadA gene. Further, the expression cassette may also include an enhancer sequence. Still further, the expression cassette may contain one or two nucleic acid molecules encoding an ecTadA protein. The recombinant vector containing the ecTadA gene expression cassette can be constructed using an existing expression vector.
The vector may be a plasmid, cosmid, phage or viral vector. In a specific embodiment of the invention, the recombinant vector is specifically a DisSUGs-1 recombinant expression vector, a DisSUGs-2 recombinant expression vector or a DisSUGs-3 recombinant expression vector.
The sequence of the DisSUGs-1 recombinant expression vector is sequence 1. The DisSUGs-1 recombinant expression vector contains four target sequences, which are shown in Table 1.
The sequence of the DisSUGs-2 recombinant expression vector is obtained by replacing the first three target sequences in sequence 1, in order, with the three target sequences DEP1-T2, ACC, and NRT1.1B-T4 while keeping the rest of the sequence unchanged. The corresponding target sequence information is shown in Table 1.
The sequence of the DisSUGs-3 recombinant expression vector is obtained by replacing the first three target sequences in sequence 1, in order, with the three target sequences SPL14, WRKY45, and DELLA while keeping the rest of the sequence unchanged. The corresponding target sequence information is shown in Table 1.
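Deriving one vector sequence from another by swapping target sequences in place can be sketched as a string operation. Everything in the example below is a placeholder: the short "vector" and all target strings are hypothetical (the real targets are in Table 1, which is not reproduced here), and the equal-length requirement is a simplification made only for this sketch.

```python
def swap_targets(vector: str, replacements: list[tuple[str, str]]) -> str:
    """Replace each old target with its new target, leaving all else unchanged.

    Positions are located in the original vector, so one replacement cannot
    interfere with another. Each old target must occur exactly once.
    """
    edited = list(vector)
    for old, new in replacements:
        if vector.count(old) != 1:
            raise ValueError(f"target {old!r} must occur exactly once")
        if len(old) != len(new):
            raise ValueError("this sketch assumes equal-length targets")
        start = vector.index(old)
        edited[start : start + len(old)] = new
    return "".join(edited)


# Hypothetical toy vector: three gene targets plus a fourth, unchanged target
# standing in for the resistance-gene surrogate target; 'n' is filler.
toy_vector = "nnGATTACAnnCCCGGGAAnnTTTTCCCCnnACACACACnn"
variant = swap_targets(
    toy_vector,
    [("GATTACA", "CATCATC"), ("CCCGGGAA", "GGGCCCTT"), ("TTTTCCCC", "AAAAGGGG")],
)
```

The fourth target and all filler positions are identical in `toy_vector` and `variant`, mirroring the statement that only the first three target sequences differ between the vectors.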
The microorganism may be a yeast, bacterium, alga, or fungus. The bacterium may be an Agrobacterium, such as Agrobacterium EHA105. In a specific embodiment of the present invention, the recombinant microorganism is Agrobacterium EHA105 containing the DisSUGs-1, DisSUGs-2, or DisSUGs-3 recombinant expression vector.
The transgenic cell line does not include propagation material.
The kit has the following uses:
M1) enriching cells in which an A·G base substitution has occurred in a genomic target sequence of an organism or organism cell;
M2) preparing a product for enriching cells in which an A·G base substitution has occurred in a genomic target sequence of an organism or organism cell;
M3) improving the A·G base substitution efficiency of a genomic target sequence of an organism or organism cell;
M4) preparing a product for improving the A·G base substitution efficiency of a genomic target sequence of an organism or organism cell;
M5) performing an A·G base substitution in a genomic target sequence of an organism or organism cell;
M6) preparing a product for performing an A·G base substitution in a genomic target sequence of an organism or organism cell.
The sgRNAs or the biological materials related to the sgRNAs also fall within the scope of the present invention.
In order to achieve the above object, the present invention also provides a novel use of the above kit or the above sgRNA or a biological material related to the sgRNA.
The invention provides the use of the above kit, or of the above sgRNA or a biological material related to the sgRNA, in any of M1)-M6):
M1) enriching cells in which an A·G base substitution has occurred in a genomic target sequence of an organism or organism cell;
M2) preparing a product for enriching cells in which an A·G base substitution has occurred in a genomic target sequence of an organism or organism cell;
M3) improving the A·G base substitution efficiency of a genomic target sequence of an organism or organism cell;
M4) preparing a product for improving the A·G base substitution efficiency of a genomic target sequence of an organism or organism cell;
M5) performing an A·G base substitution in a genomic target sequence of an organism or organism cell;
M6) preparing a product for performing an A·G base substitution in a genomic target sequence of an organism or organism cell.
In order to achieve the above object, the present invention also provides the method described in N1) or N2) or N3):
N1) a method for enriching cells in which an A·G base substitution has occurred in a genomic target sequence of an organism or organism cell, or a method for improving the A·G base substitution efficiency of a genomic target sequence of an organism or organism cell, comprising the following steps: introducing into the organism or organism cell the gene encoding the Cas9 nuclease, a DNA molecule transcribing the esgRNA that targets the target gene target sequence, a DNA molecule transcribing the sgRNA that targets the target sequence of the loss-of-function screening agent resistance gene, the gene encoding the adenine deaminase, and the loss-of-function screening agent resistance gene, so that the Cas9 nuclease, the sgRNA, and the adenine deaminase are expressed; guided by the sgRNA that targets the loss-of-function screening agent resistance gene, the Cas9 nuclease and the adenine deaminase perform an A·G base substitution on that gene's target sequence and thereby restore its function, which enriches cells carrying the A·G base substitution in the screening agent resistance gene and, in turn, enriches cells carrying the A·G base substitution in the target gene target sequence of the genome of the organism or organism cell, or improves the A·G base substitution efficiency of that target sequence;
N2) a method for enriching cells in which an A·G base substitution has occurred in a genomic target sequence of an organism or organism cell, or a method for improving the A·G base substitution efficiency of a genomic target sequence of an organism or organism cell, comprising the following steps: introducing into the organism or organism cell the Cas9 nuclease, the esgRNA targeting the target gene target sequence, the sgRNA targeting the target sequence of the loss-of-function screening agent resistance gene, the adenine deaminase, and the loss-of-function screening agent resistance gene; guided by the sgRNA that targets the loss-of-function screening agent resistance gene, the Cas9 nuclease and the adenine deaminase perform an A·G base substitution on that gene's target sequence and thereby restore its function, which enriches cells carrying the A·G base substitution in the screening agent resistance gene and, in turn, enriches cells carrying the A·G base substitution in the target gene target sequence of the genome of the organism or organism cell, or improves the A·G base substitution efficiency of that target sequence;
N3) a method for preparing a biological mutant, comprising the following steps: editing the genome of an organism according to the method of N1) or N2) to obtain a biological mutant; the biological mutant is an organism in which an A·G base substitution has occurred.
In the above method, in N1) the gene encoding the Cas9 nuclease, the DNA molecule that transcribes the esgRNA targeting the target sequence of the target gene, the DNA molecule that transcribes the sgRNA targeting the target sequence of the loss-of-function screening agent resistance gene, and the gene encoding the adenine deaminase are introduced into the organism or organism cell via a recombinant vector containing an expression cassette for each of them. The expression cassettes may all be introduced into the organism or organism cell on the same recombinant expression vector, or they may be introduced on two or more recombinant expression vectors.
In a specific embodiment of the present invention, each of the above-described expression cassettes is introduced into an organism or a cell of an organism via the same recombinant expression vector. The expression cassette of the adenine deaminase coding gene in the recombinant expression vector contains two coding genes of adenine deaminase. The recombinant expression vector is specifically the DisSUGs-1 recombinant expression vector, the DisSUGs-2 recombinant expression vector or the DisSUGs-3 recombinant expression vector.
In the above kit of parts, use or method, the A.G base substitution is a mutation of base A to base G. The base A may be at any position in the target sequence.
In the above kit of parts or use or method, the organism is P1) or P2) or P3) or P4):
P1) a plant or an animal;
P2) a monocotyledonous or dicotyledonous plant;
P3) a gramineous plant;
P4) rice (e.g., Nipponbare rice);
the biological cell is Q1) or Q2) or Q3) or Q4):
Q1) a plant cell or an animal cell;
Q2) a monocotyledonous or dicotyledonous plant cell;
Q3) a gramineous plant cell;
Q4) a rice cell (e.g., a Nipponbare rice cell).
The technical principle of the differential agent technology of the invention is as follows: an optimized esgRNA is applied to the cell enrichment technology for A.G base substitution; the optimized esgRNA is used to edit the target sequence of the endogenous target gene in the genome, while an sgRNA is used to edit the surrogate target sequence of the reporter gene, which further improves the A.G base substitution efficiency at the target sequence of the endogenous target gene.
The principle of the cell enrichment technology for A.G base substitution is as follows: an inactivated screening agent resistance gene is used as the reporter gene for A.G base substitution, so that cells in which the A.G base substitution has occurred in the reporter gene can grow in medium containing the screening agent, whereas cells in which it has not occurred cannot. If A.G base substitution editing is carried out on an endogenous target gene target at the same time, the cells that grow in medium containing the screening agent have a higher probability of carrying the A.G base substitution at the endogenous target, so that cells with the A.G base substitution at the endogenous target are enriched and the A.G base substitution efficiency at the endogenous target is improved.
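The enrichment argument above can be illustrated with a small probability sketch. It is a toy model and not part of the disclosure: it assumes that only a fraction of cells is editing-competent, and that within competent cells the reporter and the endogenous target are edited independently. The function name and all numbers are hypothetical.

```python
def target_edit_rates(competent_fraction, p_reporter, p_target):
    """Toy model: return (target edit rate without selection,
    target edit rate among cells that survive selection).

    Assumptions (hypothetical, for illustration only):
    - only editing-competent cells can be edited at all;
    - within competent cells, reporter and target edits are independent;
    - a cell survives selection only if its reporter was edited.
    """
    for value in (competent_fraction, p_reporter, p_target):
        if not 0.0 <= value <= 1.0:
            raise ValueError("all inputs must be probabilities in [0, 1]")
    if competent_fraction == 0.0 or p_reporter == 0.0:
        raise ValueError("no cell would survive selection")

    # Without selection, every cell is counted, competent or not.
    unselected = competent_fraction * p_target

    # With selection, only reporter-edited (hence competent) cells remain.
    survivors = competent_fraction * p_reporter
    edited_survivors = competent_fraction * p_reporter * p_target
    selected = edited_survivors / survivors
    return unselected, selected


# Hypothetical inputs: 40% competent cells, 50% reporter and 60% target editing.
unselected, selected = target_edit_rates(0.4, 0.5, 0.6)
print(f"without selection: {unselected:.2f}, after selection: {selected:.2f}")
```

In this toy model the gain from selection equals the reciprocal of the competent fraction, which is one simple way to see why surviving cells are enriched for the endogenous edit.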
The invention has the following advantages:
1. Many different types of genes can be used as reporter genes for enriching A.G base-substituted cells in plants. Because the genetic transformation methods of most crops (such as Agrobacterium-mediated transformation and gene gun transformation) have relatively mature and stable screening systems, using the resistance gene corresponding to the screening agent used for transformation as the reporter gene to enrich genome endogenous mutant cells is more broad-spectrum and universal than using other reporter genes such as fluorescent reporter genes or endogenous herbicide resistance genes.
2. The technical design is simple and convenient, and the surrogate target and its design can be readily applied to the resistance genes corresponding to other screening agents, so as to meet the requirements of the different transformation screening systems of different crops.
3. The differential agent technology is suitable for cell enrichment technologies based on different deaminase-mediated base editors or different Cas9 enzyme-mediated base editors, can realize enrichment of A.G base-substituted cells on the cell level, and greatly improves A.G base substitution efficiency.
Drawings
FIG. 1 is a schematic structural diagram of DisSUGs, the vector of the differential agent technology.
FIG. 2 is a schematic diagram of the working principle of enriching A.G base-substituted cells by the differential agent technology.
FIG. 3 is a comparison of the A.G base substitution efficiency at each target in rice resistant calli between the differential agent technology and the common technology.
Detailed Description
The present invention is described in further detail below with reference to specific embodiments, which are given for the purpose of illustration only and are not intended to limit the scope of the invention. The experimental procedures in the following examples are conventional unless otherwise specified. Materials, reagents, instruments and the like used in the following examples are commercially available unless otherwise specified. In the following examples, unless otherwise specified, the 1 st position of each nucleotide sequence in the sequence listing is the 5 'terminal nucleotide of the corresponding DNA/RNA, and the last position is the 3' terminal nucleotide of the corresponding DNA/RNA.
Primer pair T1 consists of primer T1-F: 5'-ctgtcttcggctggtctggg-3' and primer T1-R: 5'-tgccaagcacatcaaacaagtaaa-3', and is used for amplifying the target ALS-T4.
Primer pair T2 consists of primer T2-F: 5'-tctagactgtagtggtgataac-3' and primer T2-R: 5'-tttcttctttctgattaatggcc-3', and is used for amplifying the target CDC48-T3.
Primer pair T3 consists of primer T3-F: 5'-aatccaccaccaatccaatcc-3' and primer T3-R: 5'-caccatggcgtcgtcgtccg-3', and is used for amplifying the target AAT.
Primer pair T4 consists of primer T4-F: 5'-tcagcctgcagtactgaattatc-3' and primer T4-R: 5'-gggcctaagtgtgacatacaag-3', and is used for amplifying the target DEP1-T2.
Primer pair T5 consists of primer T5-F: 5'-gcattgctggacttcaacc-3' and primer T5-R: 5'-caaaccgtatcgcaatctgag-3', and is used for amplifying the target ACC.
Primer pair T6 consists of primer T6-F: 5'-agcatatatagcaagccaggttg-3' and primer T6-R: 5'-aataagccactgtgttatgtacgc-3', and is used for amplifying the target NRT1.1B-T4.
Primer pair T7 consists of primer T7-F: 5'-gatgtgttgtttgttgcgattc-3' and primer T7-R: 5'-agtgggcatgatggctagg-3', and is used for amplifying the target SPL14.
Primer pair T8 consists of primer T8-F: 5'-ctacagggtcacctacatcgg-3' and primer T8-R: 5'-tgagacgacacatcaacaagg-3', and is used for amplifying the target WRKY45.
Primer pair T9 consists of primer T9-F: 5'-gaagcgcgagtaccaagaag-3' and primer T9-R: 5'-atccgcttggtgtccctc-3', and is used for amplifying the target DELLA.
In the following examples, A.G base substitutions refer to mutations from A to G at any position in the target sequence.
A.G base substitution efficiency = (number of positive resistant calli in which an A.G base substitution occurred / total number of positive resistant calli analyzed) × 100%.
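The efficiency definition can be written as a short function. This is a minimal sketch; the function name and the callus counts in the usage line are hypothetical and are not data from the examples.

```python
def substitution_efficiency(edited_calli: int, total_calli: int) -> float:
    """Return the A.G base substitution efficiency as a percentage.

    edited_calli: positive resistant calli carrying an A.G base substitution.
    total_calli:  all positive resistant calli analyzed.
    """
    if total_calli <= 0:
        raise ValueError("total_calli must be positive")
    if not 0 <= edited_calli <= total_calli:
        raise ValueError("edited_calli must lie between 0 and total_calli")
    return edited_calli / total_calli * 100.0


# Hypothetical counts: 18 edited calli out of 24 analyzed.
print(substitution_efficiency(18, 24))
```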
Nipponbare rice: reference: the effects of sodium nitroprusside and its photolysis products on the growth of Nipponbare rice seedlings and the expression of 5 hormone marker genes [J]. proceedings of university of south Henan (Nature edition), 2017(2): 48-52; available to the public from the Beijing Academy of Agriculture and Forestry Sciences.
Recovery medium: N6 solid medium containing 200 mg/L timentin.
Screening medium: N6 solid medium containing 50 mg/L hygromycin.
Example 1 Establishment of the differential agent technology for EcTadA & ecTadA & Cas9n-mediated A.G base substitution
I. Establishment of the differential agent technology vector for EcTadA & ecTadA & Cas9n-mediated A.G base substitution
The common technical vector for EcTadA & ecTadA & Cas9n (ABE) -mediated A.G base substitutions was named sgRNA-GT.
The differential proxy technology vector for EcTadA & ecTadA & Cas9n (ABE) -mediated A.G base substitutions was named DisSUGs.
Schematic diagrams of the sgRNA-GT and DisSUGs vectors are shown in FIG. 1.
The differential agent technology vector differs from the common technology vector as follows:
1) The differential agent technology vector modifies the screening agent resistance gene of the common technology vector so that it loses its function, and adds a corresponding surrogate target sequence in the sgRNA part. Taking the hygromycin resistance gene (Hygromycin) as the screening agent resistance gene as an example: the screening agent resistance gene in the common technology vector is the complete hygromycin resistance gene Hygromycin, whereas the screening agent resistance gene in the differential agent technology vector is the loss-of-function hygromycin resistance gene (Hygromycin-ATG). The loss-of-function hygromycin resistance gene is the sequence obtained by removing the ATG from the complete hygromycin resistance gene Hygromycin and adding the surrogate target sequence at its 5' end. The surrogate target sequence is: ctcatagcactcaatgcggtTGG (the capital letters are the PAM sequence).
2) The differential surrogate technology uses optimized esgRNA to edit the endogenous target sequence of the genome and sgRNA to edit the surrogate target sequence of the resistance gene of the screening agent.
II. Working principle of the differential agent technology for EcTadA & ecTadA & Cas9n-mediated A.G base substitution
The working principle of the differential agent technology for A.G base substitution is shown in FIG. 2. Taking the hygromycin resistance gene as the screening agent resistance gene as an example: in the differential agent technology, the hygromycin resistance gene Hygromycin loses its resistance function once the ATG is removed, and plant cells cannot form resistant callus on hygromycin screening medium. When the A.G base substitution system of the differential agent technology (EcTadA & ecTadA & Cas9n), guided by the sgRNA, mutates A6 in the surrogate target sequence to G6 (the A at the 6th base is mutated to G), an ATG is formed; the hygromycin resistance gene Hygromycin can then be expressed normally, the resistance function is restored, and the plant cells can form resistant callus on hygromycin screening medium. Because an A.G base substitution has already occurred in the cells that form resistant callus, the A.G base substitution efficiency at the endogenous gene in these cells is correspondingly higher, so that A.G base-substituted cells are enriched and the A.G base substitution efficiency at the endogenous plant target is improved.
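The A6-to-G6 edit can be traced on the surrogate target sequence given in Example 1. This is a minimal sketch in which the helper name is hypothetical; it only performs the string substitution and makes no claim about how the deaminase selects bases within its editing window.

```python
# Surrogate target from Example 1: 20-nt protospacer followed by the PAM.
SURROGATE_PROTOSPACER = "ctcatagcactcaatgcggt"
PAM = "TGG"


def substitute_a_to_g(protospacer: str, position: int) -> str:
    """Return the protospacer with the A at a 1-based position changed to G."""
    index = position - 1
    if protospacer[index].lower() != "a":
        raise ValueError(f"base {position} is not A")
    return protospacer[:index] + "g" + protospacer[index + 1:]


edited = substitute_a_to_g(SURROGATE_PROTOSPACER, 6)

# Bases 4-6 read "ata" before the edit and "atg" (a start codon) after it.
print(SURROGATE_PROTOSPACER[3:6], "->", edited[3:6])
print(edited + PAM)
```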
Example 2 Construction of the EcTadA & ecTadA & Cas9n-mediated differential agent technology vectors and their application in rice genome editing
I. Construction of recombinant expression vectors
The recombinant expression vectors in this example fall into two types: DisSUGs recombinant expression vectors and sgRNA-GT recombinant expression vectors. A schematic diagram of the elements of the two types of recombinant expression vector is shown in FIG. 1. Each vector is a circular plasmid.
Each recombinant expression vector is divided into three types according to different target sequences, and the following six recombinant expression vectors are in total: DisSUGs-1 recombinant expression vector, DisSUGs-2 recombinant expression vector, DisSUGs-3 recombinant expression vector, sgRNA-GT-1 recombinant expression vector, sgRNA-GT-2 recombinant expression vector and sgRNA-GT-3 recombinant expression vector.
The six recombinant expression vectors are artificially synthesized, and the specific structural descriptions of the six recombinant expression vectors are respectively as follows:
The sequence of the DisSUGs-1 recombinant expression vector is sequence 1 in the sequence listing. In sequence 1, positions 131-596 are the nucleotide sequence of the OsU6a promoter, positions 712-1044 the OsU6b promoter, positions 1160-1901 the OsU6c promoter and positions 2017-2397 the OsU3 promoter; positions 597-616, 1045-1064 and 1902-1921 are the three target sequences ALS-T4, CDC48-T3 and AAT, respectively, and positions 2398-2417 are the reporter gene surrogate target sequence; positions 617-702, 1065-1150 and 1922-2007 are esgRNA nucleotide sequences, and positions 2418-2493 are the sgRNA nucleotide sequence. Positions 2511-4224 of sequence 1 are the nucleotide sequence of the OsUbq3 promoter; positions 4234-4734 and 4831-5328 are ecTadA coding sequences, encoding the ecTadA proteins shown in sequence 2; positions 5425-9525 are the coding sequence of the SpCas9n protein, encoding the SpCas9n protein shown in sequence 3; positions 9682-10014 are the 3' UTR sequence of OsUbq3; positions 10015-10267 are the nucleotide sequence of the Nos terminator; positions 10308-12300 are the nucleotide sequence of the ZmUbi1 promoter, positions 12307-12329 are the surrogate target sequence, positions 12331-13353 are the nucleotide sequence of hygromycin phosphotransferase with the initiation codon removed, and positions 13380-13595 are the nucleotide sequence of the CaMV35S terminator. The four target sequences in the DisSUGs-1 recombinant expression vector are shown in Table 1; the targets are ALS-T4, CDC48-T3, AAT and the ST1152 surrogate target, respectively.
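The coordinates listed for sequence 1 can be collected into a small feature map and checked for internal consistency. The sketch below uses only the 1-based inclusive positions stated in the text; the dictionary keys and helper names are hypothetical labels, and the frame check assumes the restored ATG occupies bases 4-6 of the surrogate target as described in Example 1.

```python
# 1-based inclusive coordinates in sequence 1, as stated in the description.
FEATURES = {
    "OsU6a promoter": (131, 596),
    "ALS-T4 target": (597, 616),
    "esgRNA 1": (617, 702),
    "OsU6b promoter": (712, 1044),
    "CDC48-T3 target": (1045, 1064),
    "esgRNA 2": (1065, 1150),
    "OsU6c promoter": (1160, 1901),
    "AAT target": (1902, 1921),
    "esgRNA 3": (1922, 2007),
    "OsU3 promoter": (2017, 2397),
    "surrogate target (sgRNA cassette)": (2398, 2417),
    "sgRNA": (2418, 2493),
    "OsUbq3 promoter": (2511, 4224),
    "ecTadA CDS 1": (4234, 4734),
    "ecTadA CDS 2": (4831, 5328),
    "SpCas9n CDS": (5425, 9525),
    "OsUbq3 3' UTR": (9682, 10014),
    "Nos terminator": (10015, 10267),
    "ZmUbi1 promoter": (10308, 12300),
    "surrogate target with PAM (reporter)": (12307, 12329),
    "hygromycin phosphotransferase without start codon": (12331, 13353),
    "CaMV35S terminator": (13380, 13595),
}


def span_length(span):
    """Length of a 1-based inclusive span."""
    start, end = span
    return end - start + 1


def features_are_ordered(features):
    """True if the features, in listed order, never overlap."""
    spans = list(features.values())
    return all(prev[1] < nxt[0] for prev, nxt in zip(spans, spans[1:]))


def restored_start_codon_in_frame(features):
    """Check that an ATG at bases 4-6 of the reporter surrogate target
    is in frame with the downstream start-codon-free coding region."""
    atg_start = features["surrogate target with PAM (reporter)"][0] + 3
    cds_start = features["hygromycin phosphotransferase without start codon"][0]
    return (cds_start - atg_start) % 3 == 0


for name, span in FEATURES.items():
    print(f"{name}: {span[0]}-{span[1]} ({span_length(span)} nt)")
print("ordered:", features_are_ordered(FEATURES))
print("restored ATG in frame:", restored_start_codon_in_frame(FEATURES))
```

Under these coordinates each guide target is 20 nt, the reporter surrogate target with its PAM is 23 nt, and the coding region without its start codon is a whole number of codons.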
The sequence of the DisSUGs-2 recombinant expression vector is the sequence obtained by replacing the first three target sequences in sequence 1, in order, with the following three target sequences: DEP1-T2, ACC and NRT1.1B-T4, with all other sequences unchanged. The corresponding target sequence information is shown in Table 1.
The sequence of the DisSUGs-3 recombinant expression vector is the sequence obtained by replacing the first three target sequences in sequence 1, in order, with the following three target sequences: SPL14, WRKY45 and DELLA, with all other sequences unchanged. The corresponding target sequence information is shown in Table 1.
The sequence of the sgRNA-GT-1 recombinant expression vector is the sequence obtained by replacing positions 12307-13353 of sequence 1 with the complete hygromycin phosphotransferase nucleotide sequence shown in sequence 4, with all other sequences unchanged.
The sequence of the sgRNA-GT-2 recombinant expression vector is the sequence obtained by replacing the first three target sequences in the sgRNA-GT-1 recombinant expression vector, in order, with the following three target sequences: DEP1-T2, ACC and NRT1.1B-T4, with all other sequences unchanged. The corresponding target sequence information is shown in Table 1.
The sequence of the sgRNA-GT-3 recombinant expression vector is the sequence obtained by replacing the first three target sequences in the sgRNA-GT-1 recombinant expression vector, in order, with the following three target sequences: SPL14, WRKY45 and DELLA, with all other sequences unchanged. The corresponding target sequence information is shown in Table 1.
The target nucleotide sequences of the esgRNAs or sgRNAs of each vector and the corresponding PAM sequences are shown in Table 1.
TABLE 1
(Table 1 appears as an image in the original publication: target nucleotide sequences and PAM sequences for each vector.)
II. Obtaining rice positive resistant calli
The DisSUGs-1, DisSUGs-2, DisSUGs-3, sgRNA-GT-1, sgRNA-GT-2 and sgRNA-GT-3 vectors obtained in step I were each processed according to the following steps 1-8:
1. The vector was introduced into Agrobacterium EHA105 (product of Shanghai Diego Biotechnology Ltd., CAT #: AC1010) to obtain recombinant Agrobacterium.
2. The recombinant Agrobacterium was cultured in medium (YEP medium containing 50 μg/ml kanamycin and 25 μg/ml rifampicin) with shaking at 28 ℃ and 150 rpm to the required OD600, then centrifuged at room temperature at 10000 rpm for 1 min; the cells were resuspended in infection solution (N6 liquid medium with its sugar replaced by glucose and sucrose; the concentrations of glucose and sucrose in the infection solution are 10 g/L and 20 g/L, respectively) and diluted to OD600 = 0.2 to obtain the Agrobacterium infection solution.
3. Mature seeds of the rice variety Nipponbare were dehusked, placed in a 100 mL Erlenmeyer flask, soaked in 70% (v/v) aqueous ethanol for 30 sec, then transferred to 25% (v/v) aqueous sodium hypochlorite and sterilized with shaking at 120 rpm for 30 min, washed 3 times with sterile water, blotted dry with filter paper, placed embryo-side down on N6 solid medium, and cultured in the dark at 28 ℃ for 4-6 weeks to obtain rice callus.
4. After step 3, the rice callus was soaked for 10 min in Agrobacterium infection solution A (obtained by adding acetosyringone to the Agrobacterium infection solution at a volume ratio of acetosyringone to Agrobacterium infection solution of 25 μl : 50 ml), then placed on a culture dish lined with two layers of sterilized filter paper (containing about 200 ml of infection solution without Agrobacterium) and cultured in the dark at 21 ℃ for 1 day.
5. The rice callus obtained in step 4 was placed on recovery medium and cultured in the dark at 25-28 ℃ for 3 days.
6. The rice callus obtained in step 5 was placed on screening medium and cultured in the dark at 28 ℃ for 2 weeks.
7. The rice callus obtained in step 6 was placed on fresh screening medium and cultured in the dark at 28 ℃ for a further 2 weeks to obtain rice resistant callus.
8. Genomic DNA was extracted from each of 20-24 rice resistant calli and used as template for PCR amplification with a primer pair consisting of primer F (5'-attatgtagcttgtgcgtttcg-3') and primer R (5'-gatgaagagcttatcgacgt-3') to obtain PCR amplification products; the PCR amplification products were subjected to agarose gel electrophoresis and judged as follows: if the PCR amplification product contains a DNA fragment of about 1150 bp, the corresponding rice resistant callus is a rice positive resistant callus; if the PCR amplification product does not contain a DNA fragment of about 1150 bp, the corresponding rice resistant callus is not a rice positive resistant callus.
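The two volume calculations implied by steps 2 and 4 (dilution to OD600 = 0.2 and the acetosyringone ratio of 25 μl per 50 ml) can be sketched as follows. The function names and example inputs are hypothetical, and the dilution formula assumes that OD600 scales linearly with cell density, which is only an approximation at higher readings.

```python
TARGET_OD600 = 0.2                      # final OD600 stated in step 2
ACETOSYRINGONE_UL_PER_ML = 25 / 50      # 25 ul per 50 ml, from step 4


def final_volume_for_target_od(measured_od600, suspension_ml,
                               target_od600=TARGET_OD600):
    """Total volume (ml) after dilution so the suspension reaches the target
    OD600, assuming OD600 is proportional to cell density (C1*V1 = C2*V2)."""
    if measured_od600 < target_od600:
        raise ValueError("suspension is already below the target OD600")
    return suspension_ml * measured_od600 / target_od600


def acetosyringone_volume_ul(infection_solution_ml):
    """Volume of acetosyringone (ul) for a given volume of infection solution."""
    return infection_solution_ml * ACETOSYRINGONE_UL_PER_ML


# Hypothetical example: 10 ml of resuspended cells measured at OD600 = 1.0.
print(final_volume_for_target_od(1.0, 10.0))   # total volume in ml
print(acetosyringone_volume_ul(50))            # ul for 50 ml of solution
```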
III. Analysis of results
1. For each vector, the genomic DNA of 20-24 rice positive resistant calli obtained in step II was used as template (two independent infections were performed to obtain the mean and variance), and each target was amplified by PCR with its corresponding primer pair to obtain PCR amplification products: primer pair T1 for the ALS-T4 target; primer pair T2 for the CDC48-T3 target; primer pair T3 for the AAT target; primer pair T4 for the DEP1-T2 target; primer pair T5 for the ACC target; primer pair T6 for the NRT1.1B-T4 target; primer pair T7 for the SPL14 target; primer pair T8 for the WRKY45 target; and primer pair T9 for the DELLA target.
2. The PCR amplification products obtained in step 1 were subjected to Sanger sequencing and analysis. Only the target regions were analyzed in the sequencing results. For each vector, the number of rice positive resistant calli with an A.G base substitution at each target was counted and the A.G base substitution efficiency was calculated; the results are shown in FIG. 3.
The results show that, with the differential agent technology, in rice resistant calli the A.G base substitution efficiency at the 5th base of the ALS-T4 target increased from 34% to 93%; at the 5th base of the CDC48-T3 target it increased from 36% to 80%, and at the 9th base from 0% to 25%; at the 6th base of the AAT target it increased from 22% to 53%; at the 4th base of the DEP1-T2 target it increased from 21% to 63%; at the 8th base of the NRT1.1B-T4 target it increased from 0% to 9%; at the 5th base of the SPL14 target it increased from 20% to 90%, and at the 7th base from 18% to 88%; at the 6th base of the DELLA target it increased from 31% to 95%. In conclusion, with the differential agent technology the A.G base substitution efficiency at most targets is improved to 2.5-3 times that of the common technology.
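The fold improvement at each edited base can be recomputed from the percentages reported above. This is a minimal sketch: the tuple layout and function name are hypothetical, and a fold change is left undefined (None) where the common technology gave 0%.

```python
# (target, base position, common technology %, differential agent technology %)
REPORTED = [
    ("ALS-T4", 5, 34, 93),
    ("CDC48-T3", 5, 36, 80),
    ("CDC48-T3", 9, 0, 25),
    ("AAT", 6, 22, 53),
    ("DEP1-T2", 4, 21, 63),
    ("NRT1.1B-T4", 8, 0, 9),
    ("SPL14", 5, 20, 90),
    ("SPL14", 7, 18, 88),
    ("DELLA", 6, 31, 95),
]


def fold_change(baseline_percent, improved_percent):
    """Return improved/baseline, or None when the baseline is zero."""
    if baseline_percent == 0:
        return None
    return improved_percent / baseline_percent


for target, position, baseline, improved in REPORTED:
    fold = fold_change(baseline, improved)
    label = "undefined (baseline 0%)" if fold is None else f"{fold:.2f}x"
    print(f"{target} A{position}: {baseline}% -> {improved}% ({label})")
```

Listing the ratios per base makes it easy to see which targets fall inside the 2.5-3 fold range stated in the conclusion and which lie outside it.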
The present invention has been described in detail above. It will be apparent to those skilled in the art that the invention can be practiced in a wide range of equivalent parameters, concentrations, and conditions without departing from the spirit and scope of the invention and without undue experimentation. While the invention has been described with reference to specific embodiments, it will be appreciated that the invention can be further modified. In general, this application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. The use of some of the essential features is possible within the scope of the claims attached below.
Sequence listing
<110> Beijing Academy of Agriculture and Forestry Sciences
<120> Application of differential agent technology in enrichment of A.G base substitution cells
<160>5
<170>PatentIn version 3.5
<210>1
<211>20001
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>1
ggtggcagga tatattgtgg tgtaaacatg gcactagcct caccgtcttc gcagacgagg 60
ccgctaagtc gcagctacgc tctcaacggc actgactagg tagtttaaac gtgcacttaa 120
ttaaggtacc tggaatcggc agcaaaggat tttttcctgt agttttccca caaccatttt 180
ttaccatccg aatgatagga taggaaaaat atccaagtga acagtattcc tataaaattc 240
ccgtaaaaag cctgcaatcc gaatgagccc tgaagtctga actagccggt cacctgtaca 300
ggctatcgag atgccataca agagacggta gtaggaacta ggaagacgat ggttgattcg 360
tcaggcgaaa tcgtcgtcct gcagtcgcat ctatgggcct ggacggaata ggggaaaaag 420
ttggccggat aggagggaaa ggcccaggtg cttacgtgcg aggtaggcct gggctctcag 480
cacttcgatt cgttggcacc ggggtaggat gcaatagaga gcaacgttta gtaccacctc 540
gcttagctag agcaaactgg actgccttat atgcgcgggt gctggcttgg ctgccgcctc 600
atgaacattc aggagcgttt cagagctatg ctggaaacag catagcaagt tgaaataagg 660
ctagtccgtt atcaacttga aaaagtggca ccgagtcggt gctttttttt ttgcaagaac 720
gaactaagcc ggacaaaaaa aaaaggagca catatacaaa ccggttttat tcatgaatgg 780
tcacgatgga tgatggggct cagacttgag ctacgaggcc gcaggcgaga gaagcctagt 840
gtgctctctg cttgtttggg ccgtaacgga ggatacggcc gacgagcgtg tactaccgcg 900
cgggatgccg ctgggcgctg cgggggccgt tggatgggga tcggtgggtc gcgggagcgt 960
tgaggggaga caggtttagt accacctcgc ctaccgaaca atgaagaacc caccttataa 1020
ccccgcgcgc tgccgcttgt gttgtagcac ccatgacaat gacagtttca gagctatgct 1080
ggaaacagca tagcaagttg aaataaggct agtccgttat caacttgaaa aagtggcacc 1140
gagtcggtgc tttttttttc tcattagcgg tatgcatgtt ggtagaagtc ggagatgtaa 1200
ataattttca ttatataaaa aaggtacttc gagaaaaata aatgcatacg aattaattct 1260
ttttatgttt tttaaaccaa gtatatagaa tttattgatg gttaaaattt caaaaatatg 1320
acgagagaaa ggttaaacgt acggcatata cttctgaaca gagagggaat atggggtttt 1380
tgttgctccc aacaattctt aagcacgtaa aggaaaaaag cacattatcc acattgtact 1440
tccagagata tgtacagcat tacgtaggta cgttttcttt ttcttcccgg agagatgata 1500
caataatcat gtaaacccag aatttaaaaa atattcttta ctataaaaat tttaattagg 1560
gaacgtatta ttttttacat gacacctttt gagaaagagg gacttgtaat atgggacaaa 1620
tgaacaattt ctaagaaatg ggcatatgac tctcagtaca atggaccaaa ttccctccag 1680
tcggcccagc aatacaaagg gaaagaaatg agggggccca caggccacgg cccacttttc 1740
tccgtggtgg ggagatccag ctagaggtcc ggcccacaag tggcccttgc cccgtgggac 1800
ggtgggattg cagagcgcgt gggcggaaac aacagtttag taccacctcg ctcacgcaac 1860
gacgcgacca cttgcttata agctgctgcg ctgaggctca gcaaggatcc cagccccgtg 1920
agtttcagag ctatgctgga aacagcatag caagttgaaa taaggctagt ccgttatcaa 1980
cttgaaaaag tggcaccgag tcggtgcttt ttttttagga atctttaaac atacgaacag 2040
atcacttaaa gttcttctga agcaacttaa agttatcagg catgcatgga tcttggagga 2100
atcagatgtg cagtcaggga ccatagcaca agacaggcgt cttctactgg tgctaccagc 2160
aaatgctgga agccgggaac actgggtacg ttggaaacca cgtgtgatgt gaaggagtaa 2220
gataaactgt aggagaaaag catttcgtag tgggccatga agcctttcag gacatgtatt 2280
gcagtatggg ccggcccatt acgcaattgg acgacaacaa agactagtat tagtaccacc 2340
tcggctatcc acatagatca aagctggttt aaaagagttg tgcagatgat ccgtggcctc 2400
atagcactca atgcggtgtt ttagagctag aaatagcaag ttaaaataag gctagtccgt 2460
tatcaacttg aaaaagtggc accgagtcgg tgcttttttt ttttaagctt acaaattcgg 2520
gtcaaggcgg aagccagcgc gccaccccac gtcagcaaat acggaggcgc ggggttgacg 2580
gcgtcacccg gtcctaacgg cgaccaacaa accagccaga agaaattaca gtaaaaaaaa 2640
agtaaattgc actttgatcc accttttatt acctaagtct caatttggat cacccttaaa 2700
cctatctttt caatttgggc cgggttgtgg tttggactac catgaacaac ttttcgtcat 2760
gtctaacttc cctttcagca aacatatgaa ccatatatag aggagatcgg ccgtatacta 2820
gagctgatgt gtttaaggtc gttgattgca cgagaaaaaa aaatccaaat cgcaacaata 2880
gcaaatttat ctggttcaaa gtgaaaagat atgtttaaag gtagtccaaa gtaaaactta 2940
tagataataa aatgtggtcc aaagcgtaat tcactcaaaa aaaatcaacg agacgtgtac 3000
caaacggaga caaacggcat cttctcgaaa tttcccaacc gctcgctcgc ccgcctcgtc 3060
ttcccggaaa ccgcggtggt ttcagcgtgg cggattctcc aagcagacgg agacgtcacg 3120
gcacgggact cctcccacca cccaaccgcc ataaatacca gccccctcat ctcctctcct 3180
cgcatcagct ccacccccga aaaatttctc cccaatctcg cgaggctctc gtcgtcgaat 3240
cgaatcctct cgcgtcctca aggtacgctg cttctcctct cctcgcttcg tttcgattcg 3300
atttcggacg ggtgaggttg ttttgttgct agatccgatt ggtggttagg gttgtcgatg 3360
tgattatcgt gagatgttta ggggttgtag atctgatggt tgtgatttgg gcacggttgg 3420
ttcgataggt ggaatcgtgg ttaggttttg ggattggatg ttggttctga tgattggggg 3480
gaatttttac ggttagatga attgttggat gattcgattg gggaaatcgg tgtagatctg 3540
ttggggaatt gtggaactag tcatgcctga gtgattggtg cgatttgtag cgtgttccat 3600
cttgtaggcc ttgttgcgag catgttcaga tctactgttc cgctcttgat tgagttattg 3660
gtgccatggg ttggtgcaaa cacaggcttt aatatgttat atctgttttg tgtttgatgt 3720
agatctgtag ggtagttctt cttagacatg gttcaattat gtagcttgtg cgtttcgatt 3780
tgatttcata tgttcacaga ttagataatg atgaactctt ttaattaatt gtcaatggta 3840
aataggaagt cttgtcgcta tatctgtcat aatgatctca tgttactatc tgccagtaat 3900
ttatgctaag aactatatta gaatatcatg ttacaatctg tagtaatatc atgttacaat 3960
ctgtagttca tctatataat ctattgtggt aatttctttt tactatctgt gtgaagatta 4020
ttgccactag ttcattctac ttatttctga agttcaggat acgtgtgctg ttactaccta 4080
tctgaataca tgtgtgatgt gcctgttact atctttttga atacatgtat gttctgttgg 4140
aatatgtttg ctgtttgatc cgttgttgtg tccttaatct tgtgctagtt cttaccctat 4200
ctgtttggtg attatttctt gcagtacgta agcatgtccg aggtggagtt ctcccacgag 4260
tactggatga ggcacgcact caccctcgca aagagggcat gggacgagag ggaggtgcct 4320
gtgggagcag tgctcgtgca caacaacagg gtgatcggag agggatggaa caggcctatc 4380
ggaaggcacg accctaccgc acacgcagag atcatggcac tcaggcaggg aggcctcgtg 4440
atgcagaact acaggctcat cgacgccacc ctctacgtga ccctcgagcc ttgcgtgatg 4500
tgcgcaggag ccatgatcca ctccaggatc ggaagggtgg tgttcggagc aagggacgca 4560
aagaccggag cagccggctc cctcatggac gtgctccacc acccgggcat gaaccacagg 4620
gtggagatca ccgagggaat cctcgcagac gagtgcgcag ccctcctctc cgacttcttc 4680
aggatgagga ggcaggagat caaggcccag aagaaggccc agtcctccac cgactccggc 4740
ggctcatcag gcggctcctc cggctccgag acaccgggca cctccgagtc cgccaccccg 4800
gagtcctccg gcggctcctc cggcggctcc tccgaggtgg agttctccca cgagtactgg 4860
atgaggcacg cactcaccct cgcaaagagg gcaagggacg agagggaggt gcctgtggga 4920
gcagtgctcg tgctcaacaa cagggtgatc ggagagggat ggaacagggc aatcggcctc 4980
cacgacccta ccgcacacgc agagatcatg gcactcaggc agggaggcct cgtgatgcag 5040
aactacaggc tcatcgacgc caccctctac gtgaccttcg agccttgcgt gatgtgcgca 5100
ggagccatga tccactccag gatcggcagg gtggtgttcg gcgtgaggaa cgcaaagacc 5160
ggagcagcag gctccctcat ggacgtgctc cactacccgg gcatgaacca cagggtggag 5220
atcaccgagg gaatcctcgc agacgagtgc gcagccctcc tctgctactt cttcaggatg 5280
ccgaggcagg tgttcaacgc ccagaagaag gcccagtcct ccaccgactc cggcggctca 5340
tcaggcggct cctccggctc cgagacaccg ggcacctccg agtccgccac cccggagtcc 5400
tccggcggct cctccggcgg ctccgacaag aagtactcca tcggcctcgc catcggcacc 5460
aacagcgtcg gctgggcggt gatcaccgac gagtacaagg tcccgtccaa gaagttcaag 5520
gtcctgggca acaccgaccg ccactccatc aagaagaacc tcatcggcgc cctcctcttc 5580
gactccggcg agacggcgga ggcgacccgc ctcaagcgca ccgcccgccg ccgctacacc 5640
cgccgcaaga accgcatctg ctacctccag gagatcttct ccaacgagat ggcgaaggtc 5700
gacgactcct tcttccaccg cctcgaggag tccttcctcg tggaggagga caagaagcac 5760
gagcgccacc ccatcttcgg caacatcgtc gacgaggtcg cctaccacga gaagtacccc 5820
actatctacc accttcgtaa gaagcttgtt gactctactg ataaggctga tcttcgtctc 5880
atctaccttg ctctcgctca catgatcaag ttccgtggtc acttccttat cgagggtgac 5940
cttaaccctg ataactccga cgtggacaag ctcttcatcc agctcgtcca gacctacaac 6000
cagctcttcg aggagaaccc tatcaacgct tccggtgtcg acgctaaggc gatcctttcc 6060
gctaggctct ccaagtccag gcgtctcgag aacctcatcg cccagctccc tggtgagaag 6120
aagaacggtc ttttcggtaa cctcatcgct ctctccctcg gtctgacccc taacttcaag 6180
tccaacttcg acctcgctga ggacgctaag cttcagctct ccaaggatac ctacgacgat 6240
gatctcgaca acctcctcgc tcagattgga gatcagtacg ctgatctctt ccttgctgct 6300
aagaacctct ccgatgctat cctcctttcg gatatcctta gggttaacac tgagatcact 6360
aaggctcctc tttctgcttc catgatcaag cgctacgacg agcaccacca ggacctcacc 6420
ctcctcaagg ctcttgttcg tcagcagctc cccgagaagt acaaggagat cttcttcgac 6480
cagtccaaga acggctacgc cggttacatt gacggtggag ctagccagga ggagttctac 6540
aagttcatca agccaatcct tgagaagatg gatggtactg aggagcttct cgttaagctt 6600
aaccgtgagg acctccttag gaagcagagg actttcgata acggctctat ccctcaccag 6660
atccaccttg gtgagcttca cgccatcctt cgtaggcagg aggacttcta ccctttcctc 6720
aaggacaacc gtgagaagat cgagaagatc cttactttcc gtattcctta ctacgttggt 6780
cctcttgctc gtggtaactc ccgtttcgct tggatgacta ggaagtccga ggagactatc 6840
accccttgga acttcgagga ggttgttgac aagggtgctt ccgcccagtc cttcatcgag 6900
cgcatgacca acttcgacaa gaacctcccc aacgagaagg tcctccccaa gcactccctc 6960
ctctacgagt acttcacggt ctacaacgag ctcaccaagg tcaagtacgt caccgagggt 7020
atgcgcaagc ctgccttcct ctccggcgag cagaagaagg ctatcgttga cctcctcttc 7080
aagaccaacc gcaaggtcac cgtcaagcag ctcaaggagg actacttcaa gaagatcgag 7140
tgcttcgact ccgtcgagat cagcggcgtt gaggaccgtt tcaacgcttc tctcggtacc 7200
taccacgatc tcctcaagat catcaaggac aaggacttcc tcgacaacga ggagaacgag 7260
gacatcctcg aggacatcgt cctcactctt actctcttcg aggataggga gatgatcgag 7320
gagaggctca agacttacgc tcatctcttc gatgacaagg ttatgaagca gctcaagcgt 7380
cgccgttaca ccggttgggg taggctctcc cgcaagctca tcaacggtat cagggataag 7440
cagagcggca agactatcct cgacttcctc aagtctgatg gtttcgctaa caggaacttc 7500
atgcagctca tccacgatga ctctcttacc ttcaaggagg atattcagaa ggctcaggtg 7560
tccggtcagg gcgactctct ccacgagcac attgctaacc ttgctggttc ccctgctatc 7620
aagaagggca tccttcagac tgttaaggtt gtcgatgagc ttgtcaaggt tatgggtcgt 7680
cacaagcctg agaacatcgt catcgagatg gctcgtgaga accagactac ccagaagggt 7740
cagaagaact cgagggagcg catgaagagg attgaggagg gtatcaagga gcttggttct 7800
cagatcctta aggagcaccc tgtcgagaac acccagctcc agaacgagaa gctctacctc 7860
tactacctcc agaacggtag ggatatgtac gttgaccagg agctcgacat caacaggctt 7920
tctgactacg acgtcgacca cattgttcct cagtctttcc ttaaggatga ctccatcgac 7980
aacaaggtcc tcacgaggtc cgacaagaac aggggtaagt cggacaacgt cccttccgag 8040
gaggttgtca agaagatgaa gaactactgg aggcagcttc tcaacgctaa gctcattacc 8100
cagaggaagt tcgacaacct cacgaaggct gagaggggtg gcctttccga gcttgacaag 8160
gctggtttca tcaagaggca gcttgttgag acgaggcaga ttaccaagca cgttgctcag 8220
atcctcgatt ctaggatgaa caccaagtac gacgagaacg acaagctcat ccgcgaggtc 8280
aaggtgatca ccctcaagtc caagctcgtc tccgacttcc gcaaggactt ccagttctac 8340
aaggtccgcg agatcaacaa ctaccaccac gctcacgatg cttaccttaa cgctgtcgtt 8400
ggtaccgctc ttatcaagaa gtaccctaag cttgagtccg agttcgtcta cggtgactac 8460
aaggtctacg acgttcgtaa gatgatcgcc aagtccgagc aggagatcgg caaggccacc 8520
gccaagtact tcttctactc caacatcatg aacttcttca agaccgagat caccctcgcc 8580
aacggcgaga tccgcaagcg ccctcttatc gagacgaacg gtgagactgg tgagatcgtt 8640
tgggacaagg gtcgcgactt cgctactgtt cgcaaggtcc tttctatgcc tcaggttaac 8700
atcgtcaaga agaccgaggt ccagaccggt ggcttctcca aggagtctat ccttccaaag 8760
agaaactcgg acaagctcat cgctaggaag aaggattggg accctaagaa gtacggtggt 8820
ttcgactccc ctactgtcgc ctactccgtc ctcgtggtcg ccaaggtgga gaagggtaag 8880
tcgaagaagc tcaagtccgt caaggagctc ctcggcatca ccatcatgga gcgctcctcc 8940
ttcgagaaga acccgatcga cttcctcgag gccaagggct acaaggaggt caagaaggac 9000
ctcatcatca agctccccaa gtactctctt ttcgagctcg agaacggtcg taagaggatg 9060
ctggcttccg ctggtgagct ccagaagggt aacgagcttg ctcttccttc caagtacgtg 9120
aacttcctct acctcgcctc ccactacgag aagctcaagg gttcccctga ggataacgag 9180
cagaagcagc tcttcgtgga gcagcacaag cactacctcg acgagatcat cgagcagatc 9240
tccgagttct ccaagcgcgt catcctcgct gacgctaacc tcgacaaggt cctctccgcc 9300
tacaacaagc accgcgacaa gcccatccgc gagcaggccg agaacatcat ccacctcttc 9360
acgctcacga acctcggcgc ccctgctgct ttcaagtact tcgacaccac catcgacagg 9420
aagcgttaca cgtccaccaa ggaggttctc gacgctactc tcatccacca gtccatcacc 9480
ggtctttacg agactcgtat cgacctttcc cagcttggtg gtgatgacga tgacaaaatg 9540
gcaccgaaga aaaaaaggaa ggtcggcggc tccccgaaga aaaaaaggaa ggtcggcggc 9600
tccccgaaga aaaaaaggaa ggtcggcggc tccccgaaga aaaaaaggaa ggtcggaatc 9660
catggcgttc catagactag ttcagccagt ttggtggagc tgccgatgtg cctggtcgtc 9720
ccgagcctct gttcgtcaag tatttgtggt gctgatgtct acttgtgtct ggtttaatgg 9780
accatcgagt ccgtatgata tgttagtttt atgaaacagt ttcctgtggg acagcagtat 9840
gctttatgaa taagttggat ttgaacctaa atatgtgctc aatttgctca tttgcatctc 9900
attcctgttg atgttttatc tgagttgcaa gtttgaaaat gctgcatatt cttattaaat 9960
cgtcatttac ttttatctta atgagctttg caatggccta tgggatataa aagagatcgt 10020
tcaaacattt ggcaataaag tttcttaaga ttgaatcctg ttgccggtct tgcgatgatt 10080
atcatataat ttctgttgaa ttacgttaag catgtaataa ttaacatgta atgcatgacg 10140
ttatttatga gatgggtttt tatgattaga gtcccgcaat tatacattta atacgcgata 10200
gaaaacaaaa tatagcgcgc aaactaggat aaattatcgc gcgcggtgtc atctatgtta 10260
ctagatcggc gcctgtccgg gcgcgcctgg tggatcgtcc gcctaggctg cagtgcagcg 10320
tgacccggtc gtgcccctct ctagagataa tgagcattgc atgtctaagt tataaaaaat 10380
taccacatat tttttttgtc acacttgttt gaagtgcagt ttatctatct ttatacatat 10440
atttaaactt tactctacga ataatataat ctatagtact acaataatat cagtgtttta 10500
gagaatcata taaatgaaca gttagacatg gtctaaagga caattgagta ttttgacaac 10560
aggactctac agttttatct ttttagtgtg catgtgttct cctttttttt tgcaaatagc 10620
ttcacctata taatacttca tccattttat tagtacatcc atttagggtt tagggttaat 10680
ggtttttata gactaatttt tttagtacat ctattttatt ctattttagc ctctaaatta 10740
agaaaactaa aactctattt tagttttttt atttaataat ttagatataa aatagaataa 10800
aataaagtga ctaaaaatta aacaaatacc ctttaagaaa ttaaaaaaac taaggaaaca 10860
tttttcttgt ttcgagtaga taatgccagc ctgttaaacg ccgtcgacga gtctaacgga 10920
caccaaccag cgaaccagca gcgtcgcgtc gggccaagcg aagcagacgg cacggcatct 10980
ctgtcgctgc ctctggaccc ctctcgagag ttccgctcca ccgttggact tgctccgctg 11040
tcggcatcca gaaattgcgt ggcggagcgg cagacgtgag ccggcacggc aggcggcctc 11100
ctcctcctct cacggcaccg gcagctacgg gggattcctt tcccaccgct ccttcgcttt 11160
cccttcctcg cccgccgtaa taaatagaca ccccctccac accctctttc cccaacctcg 11220
tgttgttcgg agcgcacaca cacacaacca gatctccccc aaatccaccc gtcggcacct 11280
ccgcttcaag gtacgccgct cgtcctcccc ccccccccct ctctaccttc tctagatcgg 11340
cgttccggtc catggttagg gcccggtagt tctacttctg ttcatgtttg tgttagatcc 11400
gtgtttgtgt tagatccgtg ctgctagcgt tcgtacacgg atgcgacctg tacgtcagac 11460
acgttctgat tgctaacttg ccagtgtttc tctttgggga atcctgggat ggctctagcc 11520
gttccgcaga cgggatcgat ttcatgattt tttttgtttc gttgcatagg gtttggtttg 11580
cccttttcct ttatttcaat atatgccgtg cacttgtttg tcgggtcatc ttttcatgct 11640
tttttttgtc ttggttgtga tgatgtggtc tggttgggcg gtcgttctag atcggagtag 11700
aattctgttt caaactacct ggtggattta ttaattttgg atctgtatgt gtgtgccata 11760
catattcata gttacgaatt gaagatgatg gatggaaata tcgatctagg ataggtatac 11820
atgttgatgc gggttttact gatgcatata cagagatgct ttttgttcgc ttggttgtga 11880
tgatgtggtg tggttgggcg gtcgttcatt cgttctagat cggagtagaa tactgtttca 11940
aactacctgg tgtatttatt aattttggaa ctgtatgtgt gtgtcataca tcttcatagt 12000
tacgagttta agatggatgg aaatatcgat ctaggatagg tatacatgtt gatgtgggtt 12060
ttactgatgc atatacatga tggcatatgc agcatctatt catatgctct aaccttgagt 12120
acctatctat tataataaac aagtatgttt tataattatt ttgatcttga tatacttgga 12180
tgatggcata tgcagcagct atatgtggat ttttttagcc ctgccttcat acgctattta 12240
tttgcttggt actgtttctt ttgtcgatgc tcaccctgtt gtttggtgtt acttctgcag 12300
gagctcctca tagcactcaa tgcggttggc aaaaagcctg aactcaccgc gacgtctgtc 12360
gagaagtttc tgatcgaaaa gttcgacagc gtctccgacc tgatgcagct ctcggagggc 12420
gaagaatctc gtgctttcag cttcgatgta ggagggcgtg gatatgtcct gcgggtaaat 12480
agctgcgccg atggtttcta caaagatcgt tatgtttatc ggcactttgc atcggccgcg 12540
ctcccgattc cggaagtgct tgacattggg gagtttagcg agagcctgac ctattgcatc 12600
tcccgccgtt cacagggtgt cacgttgcaa gacctgcctg aaaccgaact gcccgctgtt 12660
ctacaaccgg tcgcggaggc tatggatgcg atcgctgcgg ccgatcttag ccagacgagc 12720
gggttcggcc cattcggacc gcaaggaatc ggtcaataca ctacatggcg tgatttcata 12780
tgcgcgattg ctgatcccca tgtgtatcac tggcaaactg tgatggacga caccgtcagt 12840
gcgtccgtcg cgcaggctct cgatgagctg atgctttggg ccgaggactg ccccgaagtc 12900
cggcacctcg tgcacgcgga tttcggctcc aacaatgtcc tgacggacaa tggccgcata 12960
acagcggtca ttgactggag cgaggcgatg ttcggggatt cccaatacga ggtcgccaac 13020
atcttcttct ggaggccgtg gttggcttgt atggagcagc agacgcgcta cttcgagcgg 13080
aggcatccgg agcttgcagg atcgccacga ctccgggcgt atatgctccg cattggtctt 13140
gaccaactct atcagagctt ggttgacggc aatttcgatg atgcagcttg ggcgcagggt 13200
cgatgcgacg caatcgtccg atccggagcc gggactgtcg ggcgtacaca aatcgcccgc 13260
agaagcgcgg ccgtctggac cgatggctgt gtagaagtac tcgccgatag tggaaaccga 13320
cgccccagca ctcgtccgag ggcaaagaaa tagagtagat gccgaccggg atctgtcgat 13380
cgacaagctc gagtttctcc ataataatgt gtgagtagtt cccagataag ggaattaggg 13440
ttcctatagg gtttcgctca tgtgttgagc atataagaaa cccttagtat gtatttgtat 13500
ttgtaaaata cttctatcaa taaaatttct aattcctaaa accaaaatcc agtactaaaa 13560
tccagatccc ccgaattaat tcggcgttaa ttcagcctgc aggacgcgtt taattaagtg 13620
cacgcggccg cctacttagt caagagcctc gcacgcgact gtcacgcggc caggatcgcc 13680
tcgtgagcct cgcaatctgt acctagtgtt taaactatca gtgtttgaca ggatatattg 13740
gcgggtaaac ctaagagaaa agagcgttta ttagaataac ggatatttaa aagggcgtga 13800
aaaggtttat ccgttcgtcc atttgtatgt gcatgccaac cacagggttc ccctcgggat 13860
caaagtactt tgatccaacc cctccgctgc tatagtgcag tcggcttctg acgttcagtg 13920
cagccgtctt ctgaaaacga catgtcgcac aagtcctaag ttacgcgaca ggctgccgcc 13980
ctgccctttt cctggcgttt tcttgtcgcg tgttttagtc gcataaagta gaatacttgc 14040
gactagaacc ggagacatta cgccatgaac aagagcgccg ccgctggcct gctgggctat 14100
gcccgcgtca gcaccgacga ccaggacttg accaaccaac gggccgaact gcacgcggcc 14160
ggctgcacca agctgttttc cgagaagatc accggcacca ggcgcgaccg cccggagctg 14220
gccaggatgc ttgaccacct acgccctggc gacgttgtga cagtgaccag gctagaccgc 14280
ctggcccgca gcacccgcga cctactggac attgccgagc gcatccagga ggccggcgcg 14340
ggcctgcgta gcctggcaga gccgtgggcc gacaccacca cgccggccgg ccgcatggtg 14400
ttgaccgtgt tcgccggcat tgccgagttc gagcgttccc taatcatcga ccgcacccgg 14460
agcgggcgcg aggccgccaa ggcccgaggc gtgaagtttg gcccccgccc taccctcacc 14520
ccggcacaga tcgcgcacgc ccgcgagctg atcgaccagg aaggccgcac cgtgaaagag 14580
gcggctgcac tgcttggcgt gcatcgctcg accctgtacc gcgcacttga gcgcagcgag 14640
gaagtgacgc ccaccgaggc caggcggcgc ggtgccttcc gtgaggacgc attgaccgag 14700
gccgacgccc tggcggccgc cgagaatgaa cgccaagagg aacaagcatg aaaccgcacc 14760
aggacggcca ggacgaaccg tttttcatta ccgaagagat cgaggcggag atgatcgcgg 14820
ccgggtacgt gttcgagccg cccgcgcacg tctcaaccgt gcggctgcat gaaatcctgg 14880
ccggtttgtc tgatgccaag ctggcggcct ggccggccag cttggccgct gaagaaaccg 14940
agcgccgccg tctaaaaagg tgatgtgtat ttgagtaaaa cagcttgcgt catgcggtcg 15000
ctgcgtatat gatgcgatga gtaaataaac aaatacgcaa ggggaacgca tgaaggttat 15060
cgctgtactt aaccagaaag gcgggtcagg caagacgacc atcgcaaccc atctagcccg 15120
cgccctgcaa ctcgccgggg ccgatgttct gttagtcgat tccgatcccc agggcagtgc 15180
ccgcgattgg gcggccgtgc gggaagatca accgctaacc gttgtcggca tcgaccgccc 15240
gacgattgac cgcgacgtga aggccatcgg ccggcgcgac ttcgtagtga tcgacggagc 15300
gccccaggcg gcggacttgg ctgtgtccgc gatcaaggca gccgacttcg tgctgattcc 15360
ggtgcagcca agcccttacg acatatgggc caccgccgac ctggtggagc tggttaagca 15420
gcgcattgag gtcacggatg gaaggctaca agcggccttt gtcgtgtcgc gggcgatcaa 15480
aggcacgcgc atcggcggtg aggttgccga ggcgctggcc gggtacgagc tgcccattct 15540
tgagtcccgt atcacgcagc gcgtgagcta cccaggcact gccgccgccg gcacaaccgt 15600
tcttgaatca gaacccgagg gcgacgctgc ccgcgaggtc caggcgctgg ccgctgaaat 15660
taaatcaaaa ctcatttgag ttaatgaggt aaagagaaaa tgagcaaaag cacaaacacg 15720
ctaagtgccg gccgtccgag cgcacgcagc agcaaggctg caacgttggc cagcctggca 15780
gacacgccag ccatgaagcg ggtcaacttt cagttgccgg cggaggatca caccaagctg 15840
aagatgtacg cggtacgcca aggcaagacc attaccgagc tgctatctga atacatcgcg 15900
cagctaccag agtaaatgag caaatgaata aatgagtaga tgaattttag cggctaaagg 15960
aggcggcatg gaaaatcaag aacaaccagg caccgacgcc gtggaatgcc ccatgtgtgg 16020
aggaacgggc ggttggccag gcgtaagcgg ctgggttgtc tgccggccct gcaatggcac 16080
tggaaccccc aagcccgagg aatcggcgtg acggtcgcaa accatccggc ccggtacaaa 16140
tcggcgcggc gctgggtgat gacctggtgg agaagttgaa ggccgcgcag gccgcccagc 16200
ggcaacgcat cgaggcagaa gcacgccccg gtgaatcgtg gcaagcggcc gctgatcgaa 16260
tccgcaaaga atcccggcaa ccgccggcag ccggtgcgcc gtcgattagg aagccgccca 16320
agggcgacga gcaaccagat tttttcgttc cgatgctcta tgacgtgggc acccgcgata 16380
gtcgcagcat catggacgtg gccgttttcc gtctgtcgaa gcgtgaccga cgagctggcg 16440
aggtgatccg ctacgagctt ccagacgggc acgtagaggt ttccgcaggg ccggccggca 16500
tggccagtgt gtgggattac gacctggtac tgatggcggt ttcccatcta accgaatcca 16560
tgaaccgata ccgggaaggg aagggagaca agcccggccg cgtgttccgt ccacacgttg 16620
cggacgtact caagttctgc cggcgagccg atggcggaaa gcagaaagac gacctggtag 16680
aaacctgcat tcggttaaac accacgcacg ttgccatgca gcgtacgaag aaggccaaga 16740
acggccgcct ggtgacggta tccgagggtg aagccttgat tagccgctac aagatcgtaa 16800
agagcgaaac cgggcggccg gagtacatcg agatcgagct agctgattgg atgtaccgcg 16860
agatcacaga aggcaagaac ccggacgtgc tgacggttca ccccgattac tttttgatcg 16920
atcccggcat cggccgtttt ctctaccgcc tggcacgccg cgccgcaggc aaggcagaag 16980
ccagatggtt gttcaagacg atctacgaac gcagtggcag cgccggagag ttcaagaagt 17040
tctgtttcac cgtgcgcaag ctgatcgggt caaatgacct gccggagtac gatttgaagg 17100
aggaggcggg gcaggctggc ccgatcctag tcatgcgcta ccgcaacctg atcgagggcg 17160
aagcatccgc cggttcctaa tgtacggagc agatgctagg gcaaattgcc ctagcagggg 17220
aaaaaggtcg aaaaggtctc tttcctgtgg atagcacgta cattgggaac ccaaagccgt 17280
acattgggaa ccggaacccg tacattggga acccaaagcc gtacattggg aaccggtcac 17340
acatgtaagt gactgatata aaagagaaaa aaggcgattt ttccgcctaa aactctttaa 17400
aacttattaa aactcttaaa acccgcctgg cctgtgcata actgtctggc cagcgcacag 17460
ccgaagagct gcaaaaagcg cctacccttc ggtcgctgcg ctccctacgc cccgccgctt 17520
cgcgtcggcc tatcgcggcc gctggccgct caaaaatggc tggcctacgg ccaggcaatc 17580
taccagggcg cggacaagcc gcgccgtcgc cactcgaccg ccggcgccca catcaaggca 17640
ccctgcctcg cgcgtttcgg tgatgacggt gaaaacctct gacacatgca gctcccggag 17700
acggtcacag cttgtctgta agcggatgcc gggagcagac aagcccgtca gggcgcgtca 17760
gcgggtgttg gcgggtgtcg gggcgcagcc atgacccagt cacgtagcga tagcggagtg 17820
tatactggct taactatgcg gcatcagagc agattgtact gagagtgcac catatgcggt 17880
gtgaaatacc gcacagatgc gtaaggagaa aataccgcat caggcgctct tccgcttcct 17940
cgctcactga ctcgctgcgc tcggtcgttc ggctgcggcg agcggtatca gctcactcaa 18000
aggcggtaat acggttatcc acagaatcag gggataacgc aggaaagaac atgtgagcaa 18060
aaggccagca aaaggccagg aaccgtaaaa aggccgcgtt gctggcgttt ttccataggc 18120
tccgcccccc tgacgagcat cacaaaaatc gacgctcaag tcagaggtgg cgaaacccga 18180
caggactata aagataccag gcgtttcccc ctggaagctc cctcgtgcgc tctcctgttc 18240
cgaccctgcc gcttaccgga tacctgtccg cctttctccc ttcgggaagc gtggcgcttt 18300
ctcatagctc acgctgtagg tatctcagtt cggtgtaggt cgttcgctcc aagctgggct 18360
gtgtgcacga accccccgtt cagcccgacc gctgcgcctt atccggtaac tatcgtcttg 18420
agtccaaccc ggtaagacac gacttatcgc cactggcagc agccactggt aacaggatta 18480
gcagagcgag gtatgtaggc ggtgctacag agttcttgaa gtggtggcct aactacggct 18540
acactagaag gacagtattt ggtatctgcg ctctgctgaa gccagttacc ttcggaaaaa 18600
gagttggtag ctcttgatcc ggcaaacaaa ccaccgctgg tagcggtggt ttttttgttt 18660
gcaagcagca gattacgcgc agaaaaaaag gatctcaaga agatcctttg atcttttcta 18720
cggggtctga cgctcagtgg aacgaaaact cacgttaagg gattttggtc atgcattcta 18780
ggtactaaaa caattcatcc agtaaaatat aatattttat tttctcccaa tcaggcttga 18840
tccccagtaa gtcaaaaaat agctcgacat actgttcttc cccgatatcc tccctgatcg 18900
accggacgca gaaggcaatg tcataccact tgtccgccct gccgcttctc ccaagatcaa 18960
taaagccact tactttgcca tctttcacaa agatgttgct gtctcccagg tcgccgtggg 19020
aaaagacaag ttcctcttcg ggcttttccg tctttaaaaa atcatacagc tcgcgcggat 19080
ctttaaatgg agtgtcttct tcccagtttt cgcaatccac atcggccaga tcgttattca 19140
gtaagtaatc caattcggct aagcggctgt ctaagctatt cgtataggga caatccgata 19200
tgtcgatgga gtgaaagagc ctgatgcact ccgcatacag ctcgataatc ttttcagggc 19260
tttgttcatc ttcatactct tccgagcaaa ggacgccatc ggcctcactc atgagcagat 19320
tgctccagcc atcatgccgt tcaaagtgca ggacctttgg aacaggcagc tttccttcca 19380
gccatagcat catgtccttt tcccgttcca catcataggt ggtcccttta taccggctgt 19440
ccgtcatttt taaatatagg ttttcatttt ctcccaccag cttatatacc ttagcaggag 19500
acattccttc cgtatctttt acgcagcggt atttttcgat cagttttttc aattccggtg 19560
atattctcat tttagccatt tattatttcc ttcctctttt ctacagtatt taaagatacc 19620
ccaagaagct aattataaca agacgaactc caattcactg ttccttgcat tctaaaacct 19680
taaataccag aaaacagctt tttcaaagtt gttttcaaag ttggcgtata acatagtatc 19740
gacggagccg attttgaaac cgcggtgatc acaggcagca acgctctgtc atcgttacaa 19800
tcaacatgct accctccgcg agatcatccg tgtttcaaac ccggcagctt agttgccgtt 19860
cttccgaata gcatcggtaa catgagcaaa gtctgccgcc ttacaacggc tctcccgctg 19920
acgccgtccc ggactgatgg gctgcctgta tcgagtggtg attttgtgcc gagctgccgg 19980
tcggggagct gttggctggc t 20001
<210>2
<211>167
<212>PRT
<213> Artificial Sequence (Artificial Sequence)
<400>2
Met Ser Glu Val Glu Phe Ser His Glu Tyr Trp Met Arg His Ala Leu
1 5 10 15
Thr Leu Ala Lys Arg Ala Trp Asp Glu Arg Glu Val Pro Val Gly Ala
20 25 30
Val Leu Val His Asn Asn Arg Val Ile Gly Glu Gly Trp Asn Arg Pro
35 40 45
Ile Gly Arg His Asp Pro Thr Ala His Ala Glu Ile Met Ala Leu Arg
50 55 60
Gln Gly Gly Leu Val Met Gln Asn Tyr Arg Leu Ile Asp Ala Thr Leu
65 70 75 80
Tyr Val Thr Leu Glu Pro Cys Val Met Cys Ala Gly Ala Met Ile His
85 90 95
Ser Arg Ile Gly Arg Val Val Phe Gly Ala Arg Asp Ala Lys Thr Gly
100 105 110
Ala Ala Gly Ser Leu Met Asp Val Leu His His Pro Gly Met Asn His
115 120 125
Arg Val Glu Ile Thr Glu Gly Ile Leu Ala Asp Glu Cys Ala Ala Leu
130 135 140
Leu Ser Asp Phe Phe Arg Met Arg Arg Gln Glu Ile Lys Ala Gln Lys
145 150 155 160
Lys Ala Gln Ser Ser Thr Asp
165
<210>3
<211>1367
<212>PRT
<213> Artificial Sequence (Artificial Sequence)
<400>3
Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val Gly
1 5 10 15
Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe Lys
20 25 30
Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile Gly
35 40 45
Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu Lys
50 55 60
Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys Tyr
65 70 75 80
Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser Phe
85 90 95
Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys His
100 105 110
Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr His
115 120 125
Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp Ser
130 135 140
Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His Met
145 150 155 160
Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro Asp
165 170 175
Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr Asn
180 185 190
Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala Lys
195 200 205
Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn Leu
210 215 220
Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn Leu
225 230 235 240
Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe Asp
245 250 255
Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp Asp
260 265 270
Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp Leu
275 280 285
Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp Ile
290 295 300
Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser Met
305 310 315 320
Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys Ala
325 330 335
Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe Asp
340 345 350
Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser Gln
355 360 365
Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp Gly
370 375 380
Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg Lys
385 390 395 400
Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu Gly
405 410 415
Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe Leu
420 425 430
Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile Pro
435 440 445
Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp Met
450 455 460
Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu Val
465 470 475 480
Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr Asn
485 490 495
Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser Leu
500 505 510
Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys Tyr
515 520 525
Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln Lys
530 535 540
Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr Val
545 550 555 560
Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp Ser
565 570 575
Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly Thr
580 585 590
Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp Asn
595 600 605
Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr Leu
610 615 620
Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala His
625 630 635 640
Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr Thr
645 650 655
Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp Lys
660 665 670
Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe Ala
675 680 685
Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe Lys
690 695 700
Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu His
705 710 715 720
Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly Ile
725 730 735
Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly Arg
740 745 750
His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln Thr
755 760 765
Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile Glu
770 775 780
Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro Val
785 790 795 800
Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu Gln
805 810 815
Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg Leu
820 825 830
Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys Asp
835 840 845
Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg Gly
850 855 860
Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys Asn
865 870 875 880
Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys Phe
885 890 895
Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp Lys
900 905 910
Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr Lys
915 920 925
His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp Glu
930 935 940
Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser Lys
945 950 955 960
Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg Glu
965 970 975
Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val Val
980 985 990
Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe Val
995 1000 1005
Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala Lys
1010 1015 1020
Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe Tyr
1025 1030 1035
Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala Asn
1040 1045 1050
Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu Thr
1055 1060 1065
Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val Arg
1070 1075 1080
Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr Glu
1085 1090 1095
Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys Arg
1100 1105 1110
Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro Lys
1115 1120 1125
Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val Leu
1130 1135 1140
Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys Ser
1145 1150 1155
Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser Phe
1160 1165 1170
Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys Glu
1175 1180 1185
Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu Phe
1190 1195 1200
Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Gly Glu
1205 1210 1215
Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val Asn
1220 1225 1230
Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser Pro
1235 1240 1245
Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys His
1250 1255 1260
Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys Arg
1265 1270 1275
Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala Tyr
1280 1285 1290
Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn Ile
1295 1300 1305
Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala Phe
1310 1315 1320
Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr Ser Thr
1325 1330 1335
Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr Gly
1340 1345 1350
Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp
1355 1360 1365
<210>4
<211>1026
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>4
atgaaaaagc ctgaactcac cgcgacgtct gtcgagaagt ttctgatcga aaagttcgac 60
agcgtctccg acctgatgca gctctcggag ggcgaagaat ctcgtgcttt cagcttcgat 120
gtaggagggc gtggatatgt cctgcgggta aatagctgcg ccgatggttt ctacaaagat 180
cgttatgttt atcggcactt tgcatcggcc gcgctcccga ttccggaagt gcttgacatt 240
ggggagttta gcgagagcct gacctattgc atctcccgcc gttcacaggg tgtcacgttg 300
caagacctgc ctgaaaccga actgcccgct gttctacaac cggtcgcgga ggctatggat 360
gcgatcgctg cggccgatct tagccagacg agcgggttcg gcccattcgg accgcaagga 420
atcggtcaat acactacatg gcgtgatttc atatgcgcga ttgctgatcc ccatgtgtat 480
cactggcaaa ctgtgatgga cgacaccgtc agtgcgtccg tcgcgcaggc tctcgatgag 540
ctgatgcttt gggccgagga ctgccccgaa gtccggcacc tcgtgcacgc ggatttcggc 600
tccaacaatg tcctgacgga caatggccgc ataacagcgg tcattgactg gagcgaggcg 660
atgttcgggg attcccaata cgaggtcgcc aacatcttct tctggaggcc gtggttggct 720
tgtatggagc agcagacgcg ctacttcgag cggaggcatc cggagcttgc aggatcgcca 780
cgactccggg cgtatatgct ccgcattggt cttgaccaac tctatcagag cttggttgac 840
ggcaatttcg atgatgcagc ttgggcgcag ggtcgatgcg acgcaatcgt ccgatccgga 900
gccgggactg tcgggcgtac acaaatcgcc cgcagaagcg cggccgtctg gaccgatggc 960
tgtgtagaag tactcgccga tagtggaaac cgacgcccca gcactcgtcc gagggcaaag 1020
aaatag 1026
<210>5
<211>23
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>5
ctcatagcac tcaatgcggt tgg 23
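
The nucleotide entries above are laid out in groups of ten bases with a running position count at the end of each line, and the claims refer to 1-based, inclusive position ranges of sequence 1 that are converted to RNA by replacing T with U. A minimal sketch of reading such lines and taking an RNA slice is given below; the helper names `parse_listing` and `rna_slice` are hypothetical, and the sample lines are the first two lines of sequence 4 and the single line of sequence 5 as listed.

```python
import re

def parse_listing(lines):
    """Join sequence-listing lines, dropping spaces and the trailing position count."""
    residues = []
    for line in lines:
        body = re.sub(r"\s+\d+\s*$", "", line.strip())  # strip the end-of-line count
        residues.append(body.replace(" ", ""))
    return "".join(residues)

def rna_slice(dna, start, end):
    """Return positions start..end (1-based, inclusive) with every t replaced by u."""
    return dna[start - 1:end].replace("t", "u")

seq4_head = parse_listing([
    "atgaaaaagc ctgaactcac cgcgacgtct gtcgagaagt ttctgatcga aaagttcgac 60",
    "agcgtctccg acctgatgca gctctcggag ggcgaagaat ctcgtgcttt cagcttcgat 120",
])
seq5 = parse_listing(["ctcatagcac tcaatgcggt tgg 23"])

print(len(seq4_head), len(seq5))   # lengths agree with the running counts
print(rna_slice(seq4_head, 1, 10))
```

The same `rna_slice` call, applied to the full sequence 1 with the position ranges named in the claims, would give the corresponding RNA backbones; the full sequence is not repeated here.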

Claims (9)

1. A kit comprising sgRNAs, an A.G base substitution system, and a loss-of-function screening agent resistance gene or a biological material related to the loss-of-function screening agent resistance gene;
the sgRNAs consist of an esgRNA targeting the target sequence of a target gene and an sgRNA targeting the target sequence of the loss-of-function screening agent resistance gene;
the esgRNA targeting the target sequence of the target gene has the following structure: RNA transcribed from the target sequence of the target gene, joined to the esgRNA backbone;
the sgRNA targeting the target sequence of the loss-of-function screening agent resistance gene has the following structure: RNA transcribed from the target sequence of the loss-of-function screening agent resistance gene, joined to the sgRNA backbone;
the sgRNA backbone is an RNA molecule obtained by replacing T with U in positions 2418-2493 of sequence 1;
the esgRNA backbone is an RNA molecule obtained by replacing T with U in positions 617-702 of sequence 1;
the A.G base substitution system comprises a Cas9 nuclease or a biological material related to the Cas9 nuclease, and an adenine deaminase or a biological material related to the adenine deaminase;
the biological material related to the Cas9 nuclease is a nucleic acid molecule encoding the Cas9 nuclease, or an expression cassette, recombinant vector, recombinant microorganism, or transgenic cell line containing the nucleic acid molecule;
the biological material related to the adenine deaminase is a nucleic acid molecule encoding the adenine deaminase, or an expression cassette, recombinant vector, recombinant microorganism, or transgenic cell line containing the nucleic acid molecule;
the screening agent resistance gene is an exogenous resistance gene;
the biological material related to the loss-of-function screening agent resistance gene is a nucleic acid molecule encoding the loss-of-function screening agent resistance gene, or an expression cassette, recombinant vector, recombinant microorganism, or transgenic cell line containing the nucleic acid molecule;
the loss-of-function screening agent resistance gene is a sequence obtained by deleting the initiation codon of the screening agent resistance gene and adding a surrogate target sequence at its 5' end; under the guidance of the sgRNA targeting the target sequence of the loss-of-function screening agent resistance gene, the A.G base substitution system can restore the function of the loss-of-function screening agent resistance gene by carrying out an A.G base substitution in the surrogate target sequence;
the surrogate target sequence is sequence 5.
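
The restoration step described in claim 1 can be checked against the listed sequences. The sketch below is illustrative only. It joins the surrogate target sequence (sequence 5), the single c that follows it in sequence 1 (around positions 12307-12330 of that listing), and the start of sequence 4 (taken here to be the screening agent resistance gene) with its atg removed, then asks which single A-to-G changes in the 20-nt protospacer create a new atg in frame with the resistance coding sequence. The function name `restoring_edits` and the decision to scan every a in the protospacer are assumptions made for illustration, not part of the claimed kit.

```python
SURROGATE = "ctcatagcactcaatgcggttgg"   # sequence 5: 20-nt protospacer + tgg PAM
LINKER = "c"                            # base following the surrogate in sequence 1
CDS_NO_START = "aaaaagcctgaactcaccgcg"  # sequence 4, nt 4-24 (atg removed)
STOPS = {"taa", "tag", "tga"}

def restoring_edits(surrogate, linker, cds_no_start, protospacer_len=20):
    """List (edited position, new atg position), both 1-based, for single a->g edits
    that create a new atg in frame with the downstream coding sequence."""
    construct = surrogate + linker + cds_no_start
    cds_start = len(surrogate) + len(linker)  # 0-based index of the first coding base
    hits = []
    for i in range(protospacer_len):
        if construct[i] != "a":
            continue
        edited = construct[:i] + "g" + construct[i + 1:]
        for j in range(max(0, i - 2), i + 1):
            # keep only atg triplets that did not exist before the edit
            if edited[j:j + 3] != "atg" or construct[j:j + 3] == "atg":
                continue
            if (cds_start - j) % 3 != 0:
                continue  # out of frame with the resistance gene
            codons = [edited[k:k + 3] for k in range(j, cds_start, 3)]
            if not STOPS.intersection(codons):
                hits.append((i + 1, j + 1))
    return hits

print(restoring_edits(SURROGATE, LINKER, CDS_NO_START))  # prints [(6, 4)]
```

Under these assumptions the only hit is the a at protospacer position 6, which turns ata into atg at positions 4-6 of the surrogate, in frame with the downstream coding sequence.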
2. The kit of claim 1, wherein: the screening agent resistance gene is a hygromycin resistance gene.
3. The kit of claim 1, wherein: the Cas9 nuclease is SpCas9n protein;
the adenine deaminase is an ecTadA protein;
the SpCas9n protein is a protein shown as a sequence 3;
the biological material related to the SpCas9n protein is any one of B1) to B5):
B1) a nucleic acid molecule encoding the SpCas9n protein;
B2) an expression cassette comprising the nucleic acid molecule of B1);
B3) a recombinant vector containing the nucleic acid molecule of B1) or a recombinant vector containing the expression cassette of B2);
B4) a recombinant microorganism containing the nucleic acid molecule of B1), or a recombinant microorganism containing the expression cassette of B2), or a recombinant microorganism containing the recombinant vector of B3);
B5) a transgenic cell line comprising the nucleic acid molecule of B1) or a transgenic cell line comprising the expression cassette of B2);
the ecTadA protein is a protein shown in sequence 2;
the biological material related to said ecTadA protein is any one of F1) to F5):
F1) a nucleic acid molecule encoding said ecTadA protein;
F2) an expression cassette comprising the nucleic acid molecule of F1);
F3) a recombinant vector comprising the nucleic acid molecule of F1) or a recombinant vector comprising the expression cassette of F2);
F4) a recombinant microorganism containing the nucleic acid molecule of F1), or a recombinant microorganism containing the expression cassette of F2), or a recombinant microorganism containing the recombinant vector of F3);
F5) a transgenic cell line comprising the nucleic acid molecule of F1) or a transgenic cell line comprising the expression cassette of F2).
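
Sequences 2 and 3 are listed in three-letter amino acid code with position counts on alternate lines. A minimal sketch for converting such lines to one-letter code is shown below, using only the first listed line of each protein; the helper name `to_one_letter` is hypothetical. As an observation on the listing rather than a statement from the claims: sequence 3 starts without an initiator Met and has Ala at position 9, where wild-type SpCas9 numbered from its initiator Met has Asp at position 10, which is consistent with the D10A nickase usually written Cas9n.

```python
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

def to_one_letter(listing_lines):
    """Convert three-letter residues to one-letter code, skipping the position-count lines."""
    residues = []
    for line in listing_lines:
        for token in line.split():
            if token in THREE_TO_ONE:  # numeric position markers are ignored
                residues.append(THREE_TO_ONE[token])
    return "".join(residues)

seq3_start = to_one_letter([
    "Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val Gly",
    "1 5 10 15",
])
seq2_start = to_one_letter([
    "Met Ser Glu Val Glu Phe Ser His Glu Tyr Trp Met Arg His Ala Leu",
    "1 5 10 15",
])

print(seq3_start)        # DKKYSIGLAIGTNSVG
print(seq3_start[9 - 1]) # residue at listed position 9
```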
4. Use of the kit of any one of claims 1 to 3 in any one of M1) to M6):
M1) enriching cells in which an A.G base substitution has occurred in a genomic target sequence of an organism or organism cell;
M2) preparing a product for enriching cells in which an A.G base substitution has occurred in a genomic target sequence of an organism or organism cell;
M3) improving the A.G base substitution efficiency at a genomic target sequence of an organism or organism cell;
M4) preparing a product for improving the A.G base substitution efficiency at a genomic target sequence of an organism or organism cell;
M5) carrying out an A.G base substitution in a genomic target sequence of an organism or organism cell;
M6) preparing a product for carrying out an A.G base substitution in a genomic target sequence of an organism or organism cell.
5. The method of N1), N2), or N3):
N1) A method for enriching cells in which an A.G base substitution has occurred in a genomic target sequence of an organism or organism cell, or for improving the A.G base substitution efficiency at a genomic target sequence of an organism or organism cell, comprising the following steps: introducing into the organism or organism cell the gene encoding the Cas9 nuclease, a DNA molecule that transcribes the esgRNA targeting the target sequence of the target gene, a DNA molecule that transcribes the sgRNA targeting the target sequence of the loss-of-function screening agent resistance gene, the gene encoding the adenine deaminase, and the loss-of-function screening agent resistance gene, each as defined in any one of claims 1-3, so that the Cas9 nuclease, the sgRNA, and the adenine deaminase are all expressed; under the guidance of the sgRNA targeting the target sequence of the loss-of-function screening agent resistance gene, the Cas9 nuclease and the adenine deaminase can restore the function of the loss-of-function screening agent resistance gene by carrying out an A.G base substitution in that target sequence, thereby enriching cells in which the A.G base substitution of the screening agent resistance gene has occurred, and in turn enriching cells in which an A.G base substitution has occurred in the target sequence of the target gene in the genome of the organism or organism cell, or improving the A.G base substitution efficiency at the target sequence of the target gene in the genome of the organism or organism cell;
N2) A method for enriching cells in which an A.G base substitution has occurred in a genomic target sequence of an organism or organism cell, or for improving the A.G base substitution efficiency at a genomic target sequence of an organism or organism cell, comprising the following steps: introducing into the organism or organism cell the Cas9 nuclease, the esgRNA targeting the target sequence of the target gene, the sgRNA targeting the target sequence of the loss-of-function screening agent resistance gene, the adenine deaminase, and the loss-of-function screening agent resistance gene, each as defined in any one of claims 1-3; under the guidance of the sgRNA targeting the target sequence of the loss-of-function screening agent resistance gene, the Cas9 nuclease and the adenine deaminase can restore the function of the loss-of-function screening agent resistance gene by carrying out an A.G base substitution in that target sequence, thereby enriching cells in which the A.G base substitution of the screening agent resistance gene has occurred, and in turn enriching cells in which an A.G base substitution has occurred in the target sequence of the target gene in the genome of the organism or organism cell, or improving the A.G base substitution efficiency at the target sequence of the target gene in the genome of the organism or organism cell;
N3) A method for preparing an organism mutant, comprising the following steps: editing the genome of an organism according to the method of N1) or N2) to obtain the organism mutant; the organism mutant is an organism in which an A.G base substitution has occurred.
6. The use according to claim 4 or the method according to claim 5, characterized in that: the organism is a plant or an animal; the biological cell is a plant cell or an animal cell.
7. The use or method according to claim 6, wherein: the plant is a monocotyledon or a dicotyledon; the plant cell is a monocotyledon cell or a dicotyledon cell.
8. The use or method according to claim 7, wherein: the monocotyledon is a gramineous plant; the monocotyledon cell is a gramineae plant cell.
9. The use or method according to claim 8, wherein: the gramineous plant is rice; the gramineous plant cell is a rice cell.
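The selection logic recited in N1) and N2) of claim 5 can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not the patented construct: the sequences, the editing-window positions, and the use of a premature TAG stop codon as the inactivating mutation are all hypothetical (the actual loss-of-function design is defined in claims 1-3, which are not reproduced here). The sketch also idealises co-editing, assuming that a cell with active editing machinery converts both the resistance-gene site and the target-gene site.

```python
from dataclasses import dataclass

STOP_CODONS = {"TAA", "TAG", "TGA"}


def a_to_g(seq: str, window: range) -> str:
    """Idealised adenine base editing: every A inside the window becomes G."""
    return "".join(
        "G" if (i in window and base == "A") else base
        for i, base in enumerate(seq)
    )


def has_premature_stop(cds: str) -> bool:
    """True if any in-frame codon before the final codon is a stop codon."""
    internal = [cds[i:i + 3] for i in range(0, len(cds) - 3, 3)]
    return any(codon in STOP_CODONS for codon in internal)


@dataclass
class Cell:
    resistance_cds: str  # loss-of-function resistance gene (the surrogate)
    target_site: str     # target sequence of the target gene


# Hypothetical sequences and windows, chosen only for illustration.
BROKEN_RESISTANCE = "ATGTAGCCTGGCTAA"  # premature TAG; A-to-G gives TGG (Trp)
TARGET_SITE = "CCAGTACGG"
RESISTANCE_WINDOW = range(4, 5)        # covers the A of the premature TAG
TARGET_WINDOW = range(2, 3)            # covers the first A of the target site


def edit(cell: Cell, active: bool) -> Cell:
    """Assumption: an active editor converts both sites; an inactive one neither."""
    if not active:
        return cell
    return Cell(
        a_to_g(cell.resistance_cds, RESISTANCE_WINDOW),
        a_to_g(cell.target_site, TARGET_WINDOW),
    )


def survives_selection(cell: Cell) -> bool:
    """A cell survives the screening agent only if the resistance gene is restored."""
    return not has_premature_stop(cell.resistance_cds)


population = [
    edit(Cell(BROKEN_RESISTANCE, TARGET_SITE), active)
    for active in (True, False, True, False, False)
]
survivors = [cell for cell in population if survives_selection(cell)]
```

In this idealised model every survivor also carries the A-to-G substitution at the target site. In real cells the two edits are only correlated, so selection enriches edited cells rather than guaranteeing them.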
CN201910938672.XA 2019-09-30 2019-09-30 Application of differential proxy technology in enrichment of A.G base substitution cells Active CN110669775B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910938672.XA CN110669775B (en) 2019-09-30 2019-09-30 Application of differential proxy technology in enrichment of A.G base substitution cells

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910938672.XA CN110669775B (en) 2019-09-30 2019-09-30 Application of differential proxy technology in enrichment of A.G base substitution cells

Publications (2)

Publication Number Publication Date
CN110669775A (en) 2020-01-10
CN110669775B (en) 2021-07-16

Family

ID=69080360

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910938672.XA Active CN110669775B (en) 2019-09-30 2019-09-30 Application of differential proxy technology in enrichment of A.G base substitution cells

Country Status (1)

Country Link
CN (1) CN110669775B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114317589B (en) * 2020-09-30 2024-01-16 北京市农林科学院 Application of SpRYn-ABE base editing system in plant genome base substitution
CN114317596B (en) * 2020-09-30 2024-01-16 北京市农林科学院 Method for mutating A in plant genome target sequence into G

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2003090515A2 (en) * 2002-03-25 2003-11-06 Applera Corporation Systems and methods for detection of nuclear receptor function using reporter enzyme mutant complementation
CN108795972A (en) * 2017-05-05 2018-11-13 中国科学院遗传与发育生物学研究所 Method for isolating cells without using a transgenic marker sequence
CN109295186A (en) * 2018-09-30 2019-02-01 中山大学 Method for detecting off-target effects of adenine single-base editing systems based on whole-genome sequencing, and its application in gene editing
CN109306361A (en) * 2018-02-11 2019-02-05 华东师范大学 Novel gene editing system for site-directed A/T to G/C base conversion

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2003090515A2 (en) * 2002-03-25 2003-11-06 Applera Corporation Systems and methods for detection of nuclear receptor function using reporter enzyme mutant complementation
CN108795972A (en) * 2017-05-05 2018-11-13 中国科学院遗传与发育生物学研究所 Method for isolating cells without using a transgenic marker sequence
CN109306361A (en) * 2018-02-11 2019-02-05 华东师范大学 Novel gene editing system for site-directed A/T to G/C base conversion
CN109295186A (en) * 2018-09-30 2019-02-01 中山大学 Method for detecting off-target effects of adenine single-base editing systems based on whole-genome sequencing, and its application in gene editing

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
"Discriminated sgRNAs-Based SurroGate System Greatly Enhances the Screening Efficiency of Plant Base-Edited Cells"; Wen Xu et al.; Molecular Plant; 20191018; Vol. 13; pp. 169-180 *
"Expanded base editing in rice and wheat using a Cas9-adenosine deaminase fusion"; Chao Li et al.; Genome Biology; 20180529; Vol. 19; pp. 1-9 *
"Surrogate reporter-based enrichment of cells containing RNA-guided Cas9 nuclease-induced mutations"; Suresh Ramakrishna et al.; Nature Communications; 20140226; Vol. 5; pp. 1-10 and supplementary data pp. 1-22 *
"Surrogate reporters for enrichment of cells with nuclease-induced mutations"; Hyojin Kim et al.; Nature Methods; 20111009; Vol. 8 (No. 11); pp. 941-944 *
"Development of base editors and their application in bacterial genome editing"; Zhao Yawei et al.; Microbiology China (微生物学通报); 20190220; Vol. 46 (No. 2); pp. 319-331 *

Also Published As

Publication number Publication date
CN110669775A (en) 2020-01-10

Similar Documents

Publication Publication Date Title
CN110527697B (en) RNA fixed-point editing technology based on CRISPR-Cas13a
AU2013254857B2 (en) Targeted genome engineering in plants
KR20210149686A (en) Polypeptides useful for gene editing and methods of use
CN108138156A (en) Methods and compositions for marker-free genome modification
RU2553206C2 (en) Proteins relating to grain shape and leaf shape of rice, genes coding said proteins and uses thereof
CN110669775B (en) Application of differential proxy technology in enrichment of A.G base substitution cells
CN114230677B (en) Recombinant protein containing Cap of hog cholera E2 and circovirus, preparation method and application thereof
JP2004520032A (en) Plastid transformation method and vector therefor
Li et al. TALEN utilization in rice genome modifications
EP2383337B1 (en) Novel promoter for use in transformation of algae
CN110951702B (en) Rice DMNT and TMTT synthesis related protein OsCYP92C21, and coding gene and application thereof
BRPI0616533A2 (en) isolated polynucleotide, isolated nucleic acid fragment, recombinant DNA constructs, plants, seeds, plant cells, plant tissues, nucleic acid fragment isolation method, genetic variation mapping method, molecular cultivation method, corn plants, methods of nitrogen transport of plants and hat variants of altered plants
CN114686456B (en) Base editing system based on bimolecular deaminase complementation and application thereof
CN114438115A (en) CRISPR/Cas9 gene editing vector, construction method and application thereof
CN112239756B (en) Group of cytosine deaminases from plants and their use in base editing systems
JP4312478B2 (en) Improvement of homologous recombination frequency by UVDE expression
US8691505B2 (en) Promoter for use in transformation of algae
CN111304239B (en) Rolling circle replication recombinant vector, construction method and application
US20040163145A1 (en) Integrases for the insertion of heterolgogous nucleic acids into the plastid genome
CN114457088A (en) Event CTC75064-3, insect-resistant sugarcane plants, methods of producing and detecting insect-resistant sugarcane plants
CN114908111B (en) Method and system for continuous cloning of long DNA fragments
CN114621974B (en) Vector of plant single-gene or multi-gene CRISPR (clustered regularly interspaced short palindromic repeats) activation technology, and construction method and application thereof
CN110872596A (en) Construction method of saccharomyces cerevisiae for co-utilizing xylose and arabinose
CN111019957B (en) Upland cotton transformation event ICR24-243 and identification method and application thereof
CN107227312B (en) Method for increasing content of salvianolic acid B in hairy roots of salvia miltiorrhiza and laccase gene

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant