CN113851186A - Construction method of homologous type 2 CRISPR/Cas gene editing system - Google Patents

Construction method of homologous type 2 CRISPR/Cas gene editing system


Publication number
CN113851186A
Authority
CN
China
Prior art keywords
lys
leu
glu
ile
asp
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110589533.8A
Other languages
Chinese (zh)
Inventor
胡争
崔资凤
田瑞
金庄
黄昭玥
李梦媛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Little Skylark Health Management Co.,Ltd.
Original Assignee
First Affiliated Hospital of Sun Yat Sen University
Sun Yat Sen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by First Affiliated Hospital of Sun Yat Sen University, Sun Yat Sen University
Publication of CN113851186A

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • G16B30/10 Sequence alignment; Homology search
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • G16B30/20 Sequence assembly
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B50/00 ICT programming tools or database systems specially adapted for bioinformatics
    • G16B50/30 Data warehousing; Computing architectures

Landscapes

  • Life Sciences & Earth Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Biology (AREA)
  • Biophysics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Medical Informatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biotechnology (AREA)
  • Analytical Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Chemical & Material Sciences (AREA)
  • Molecular Biology (AREA)
  • Genetics & Genomics (AREA)
  • Bioethics (AREA)
  • Databases & Information Systems (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention relates to a construction method of a homologous type 2 CRISPR/Cas gene editing system, belonging to the technical field of gene editing. Metagenomic data from various environments are deeply analyzed, short reads (100-300 bp) are assembled into long DNA sequences (not less than 4000 bp), and new homologous type 2 CRISPR/Cas gene editing systems are mined from the long DNA sequences. The content predicted by the method comprises: the effector Cas protein, auxiliary proteins related to the effector Cas protein, auxiliary sequences related to the effector Cas protein, and key sequences for the effector Cas protein. The construction method is easy to implement, efficient, low in cost and broadly applicable; the constructed homologous type 2 CRISPR/Cas gene editing system can be used to modify a targeted genome precisely, enabling a wide range of gene editing and modification functions.

Description

Construction method of homologous type 2 CRISPR/Cas gene editing system
Technical Field
The invention relates to a construction method of a homologous type 2 CRISPR/Cas gene editing system, belonging to the technical field of gene editing.
Background
Gene editing technology makes it possible to modify specific sites of DNA sequences. Zinc finger nucleases (ZFNs), the first-generation gene editing tools, and transcription activator-like effector nucleases (TALENs), the second-generation gene editing tools, can both be used to modify targeted genomes, but these methods are difficult to design and manufacture, expensive, and not broadly applicable.
The CRISPR (clustered regularly interspaced short palindromic repeats) system is a natural immune system of archaea and bacteria and is the third-generation gene editing tool. Unlike conventional gene editing tools, which rely on protein-DNA recognition, the CRISPR/Cas system uses a guide RNA (single guide RNA, sgRNA) to recognize the target DNA base sequence by complementary base pairing and guides the Cas effector protein to perform site-specific cleavage; it has the advantages of strong applicability, simple design, low cost and high efficiency. Cas proteins contain a variety of different effector domains that function in activities such as nucleic acid recognition, stabilization of the complex structure, and hydrolysis of DNA phosphodiester bonds. Among them, the type II CRISPR/Cas9 system derived from Streptococcus pyogenes (SpCas9) is currently the most widely used CRISPR/Cas system because of its high cleavage efficiency.
Most existing CRISPR gene editing systems are the SpCas9 system from Streptococcus pyogenes. This system recognizes and cleaves its target only next to a protospacer adjacent motif (PAM) sequence, namely 'NGG', on the targeted polynucleotide, which limits its wider application; obtaining more homologous CRISPR/Cas systems that have different physicochemical properties and recognize different PAMs has therefore become a research focus. At the same time, large and diverse metagenomes harbor microorganisms that have not been cultured or even discovered, and may contain a large number of undiscovered CRISPR/Cas editing systems whose activity needs to be confirmed in prokaryotes, in eukaryotes and in vitro.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a construction method of a homologous type 2 CRISPR/Cas gene editing system, and 5 new homologous CRISPR/Cas gene editing systems are predicted by the construction method.
In order to achieve the purpose, the invention adopts the technical scheme that: a construction method of a homologous type 2 CRISPR/Cas gene editing system comprises the following steps:
(1) processing the metagenome sequencing data, and screening to obtain a contig sequence and a protein sequence;
(2) clustering the protein sequences in the step (1) by using software, and then expanding the clusters by using the HMMER software package;
(3) searching for clusters related to the type 2 CRISPR/Cas system in the expanded clusters of step (2) using known hidden Markov model files of type 2 CRISPR/Cas system-related effector proteins;
(4) comparing the clusters related to the type 2 CRISPR/Cas system in the step (3) with a related database, and screening to obtain genes;
(5) performing CRISPR/Cas system prediction on contigs where the genes obtained by screening in the step (4) are located, extracting predicted effector proteins of the type 2 CRISPR/Cas system, and analyzing the structural domains by using software;
(6) performing comparative genomics and evolutionary analysis on the structural domains of the effector protein of the type 2 CRISPR/Cas system predicted in the step (5) and on the auxiliary proteins related to that effector protein, to obtain the homologous type 2 CRISPR/Cas gene editing system.
The inventors obtained draft long DNA sequences (not less than 4000 bp) by deeply analyzing metagenomic data from various environments and assembling short reads (100-300 bp), and mined new homologous type 2 CRISPR/Cas gene editing systems from the long DNA sequences. The content predicted by the method comprises: Cas effector proteins (Cas9, Cas12 and Cas13), auxiliary proteins related to the function of the Cas effector proteins, auxiliary sequences related to the function of the Cas effector proteins, and key sequences for the function of the Cas effector proteins. Prediction of CRISPR-Cas-related proteins and elements is performed on the assembled long DNA sequences and does not depend on accurate species assignment. The method accurately identifies the protein-coding genes, the clustered regularly interspaced short palindromic repeat (CRISPR) sequence, the crRNA sequence and the trans-activating crRNA sequence of the type 2 CRISPR/Cas system; it further mines structure predictions relevant to DNA cleavage, including but not limited to crRNA-tracrRNA secondary structure prediction, Cas effector protein functional domain prediction, and Cas-crRNA-tracrRNA/Cas-crRNA complex structure prediction; and it further explores the variety of protospacer adjacent motifs (PAMs) immediately downstream/upstream of the targeting sequence recognized by Cas9/Cas12 effector proteins.
As a preferred embodiment of the construction method of the present invention, in the step (1), the processing of metagenomic sequencing data specifically comprises: performing quality control, assembly and gene prediction on the metagenomic data; the metagenomic sequencing data are intestinal metagenomic sequencing data; the length of the contig sequence is not less than 4000 bp, and the length of the protein sequence is not less than 600 aa.
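The two length thresholds can be illustrated with a minimal Python sketch. It assumes the contigs and translated proteins are available as FASTA text; the function names `parse_fasta` and `filter_by_length` are hypothetical and are not part of any tool named in this document. The protein threshold is counted in amino acid residues, as stated in the filtering step of Example 2.

```python
# Illustrative sketch only: thresholds follow the text (contigs >= 4000 bp,
# proteins >= 600 residues); function names are hypothetical.
MIN_CONTIG_BP = 4000
MIN_PROTEIN_AA = 600


def parse_fasta(text):
    """Parse FASTA text into a dict mapping record id -> sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token as the record id.
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {key: "".join(parts) for key, parts in records.items()}


def filter_by_length(records, min_len):
    """Keep records whose sequence (ignoring a trailing stop '*') is long enough."""
    return {
        key: seq
        for key, seq in records.items()
        if len(seq.rstrip("*")) >= min_len
    }
```

A call such as `filter_by_length(parse_fasta(contig_text), MIN_CONTIG_BP)` would keep only the long contigs, and the same function with `MIN_PROTEIN_AA` would filter the translated proteins.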
As a preferred embodiment of the construction method of the present invention, the step (2) is specifically: the protein sequences are clustered using OrthoFinder software, and the clusters are then expanded using the HMMER software package.
As a preferred embodiment of the construction method of the present invention, in the step (4), the relevant databases are the Swiss-Prot database and the NCBI nr database.
In the step (5), the software for predicting the CRISPR/Cas system is CRISPRCasFinder, and the software for analyzing the structural domain is HHpred.
The invention also aims to provide a homologous type 2 CRISPR/Cas gene editing system constructed by adopting the construction method.
As a preferred embodiment of the homologous type 2 CRISPR/Cas gene editing system of the present invention, the homologous type 2 CRISPR/Cas gene editing system comprises Cas ribonucleoprotein complex and CRISPR RNA.
As a preferred embodiment of the homologous type 2 CRISPR/Cas gene editing system of the present invention, the homologous type 2 CRISPR/Cas gene editing system further comprises an accessory protein Cas1, an accessory protein Cas2, an accessory protein Cas4, an accessory protein Csn2, an accessory protein csx27, an accessory protein csx28, an accessory protein WYL or a trans-activating CRISPR RNA.
The Cas9 endonuclease is a multidomain, multifunctional DNA endonuclease whose recognition of the targeting sequence requires two factors: one is complementary base pairing between the bound guide RNA and the complementary strand of the targeting sequence; the other is the specific protospacer adjacent motif (PAM) sequence adjacent to the targeting sequence. Through different nuclease domains, the Cas9 protein efficiently cleaves double-stranded DNA complementary to the guide RNA upstream of the PAM sequence: the HNH-like nuclease domain cuts the DNA strand complementary to the RNA sequence, and the RuvC-like nuclease domain cuts the non-complementary strand, forming a double-strand break (DSB). The Cas12 endonuclease cleaves double-stranded DNA complementary to the CRISPR RNA (crRNA) downstream of the PAM sequence through its nuclease domains, creating sticky ends with a 5-nucleotide overhang; the nuclease domain is an HNH-like nuclease domain or a RuvC-like nuclease domain. Cas13 is a multidomain, multifunctional RNA endonuclease that cleaves single-stranded RNA complementary to its guide RNA through two HEPN-like nuclease domains.
In the Cas9 protein system, the crRNA (CRISPR RNA) and the trans-activating crRNA (tracrRNA) together form the sgRNA; in the Cas12 and Cas13 protein systems, the crRNA functions as the guide RNA without a trans-activating crRNA. The transcript of the CRISPR array, the precursor crRNA (pre-crRNA), is further processed by RNase to produce crRNA; the crRNA contains the base-pairing portion that guides the Cas protein to recognize invading foreign DNA/RNA, and this sequence is typically about 19-23 bp long.
By artificially designing the targeting sequence in the crRNA, the CRISPR/Cas system can in principle target any DNA/RNA sequence of interest in the genome. For DNA recognition, cleavage generates site-specific DSBs, which are then repaired by non-homologous end joining (NHEJ), producing small random insertions/deletions (indels) at the cleavage site and inactivating the gene of interest; alternatively, precise genome modification can be made at the DSB site by high-fidelity homology-directed repair using a homologous repair template, enabling a wide range of gene editing and modification functions.
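As a rough illustration of how a targeting sequence is constrained by the PAM, the following Python sketch lists candidate Cas9 target sites on one strand of a DNA string. It assumes the 'NGG' PAM of SpCas9 mentioned in the Background and a 20-nt targeting sequence located immediately 5' of the PAM; the 20-nt length, the function name `find_targets` and the single-strand scan are simplifying assumptions and not part of the method described here.

```python
import re


def find_targets(seq, pam_regex="[ACGT]GG", guide_len=20):
    """Return (start, protospacer, pam) for each forward-strand site where
    a PAM immediately follows a window of guide_len bases.

    Assumptions: the default PAM is 'NGG' written as a regular expression,
    and only the given strand is scanned (the reverse strand is ignored).
    """
    seq = seq.upper()
    # A zero-width lookahead lets overlapping candidate sites all be reported.
    pattern = re.compile(r"(?=([ACGT]{%d})(%s))" % (guide_len, pam_regex))
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]
```

Passing a different `pam_regex` would adapt the same scan to an effector protein that recognizes another PAM.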
As a preferred embodiment of the homologous type 2 CRISPR/Cas gene editing system, the chain length of the Cas9 ribonucleoprotein complex is 900-1500 aa, and the length of the CRISPR RNA is 15-36 bp. The chain length of the trans-activating CRISPR RNA complex is 70-160 bp, and the length of the PAM sequence recognized by the Cas9/Cas12 ribonucleoprotein complex is 1-10 bp.
As a preferred embodiment of the homologous type 2 CRISPR/Cas gene editing system described in the present invention,
the homologous type 2 CRISPR/Cas gene editing system has 3 major classes of effector proteins, namely effector Cas9 proteins, effector Cas12 proteins and effector Cas13 proteins;
there are 12 effector Cas9 proteins, numbered C1556, C1793, C1807, C4640, C6165, Lt1, Lt2, Lt3, Lt4, Lt5, Lt6 and Lt7, whose protein sequences are shown as SEQ1-SEQ12 respectively; there is 1 effector Cpf1 protein, numbered LtCpf1, whose protein sequence is shown as SEQ13; there are 2 effector Cas13 proteins, numbered LtCas13b and LtCas13d, whose protein sequences are shown as SEQ14-SEQ15 respectively; wherein, in the homologous type 2 CRISPR/Cas9 gene editing systems numbered C1556, C1793, C1807, C4640 and C6165: the chain lengths of the Cas9 ribonucleoprotein complex are 1144 aa, 1358 aa, 1426 aa, 1315 aa and 1152 aa respectively, the chain lengths of the complex of the CRISPR RNA and the trans-activating CRISPR RNA are 126 bp, 124 bp, 141 bp, 152 bp and 120 bp respectively, and the PAM sequences recognized by the Cas9 ribonucleoprotein complex are W (A > T) TNTAH (A > T > C) NNAT, NATS (G > C) NY (C > T) GAT, NNTA, NNCGC and CGNGAGG respectively.
Compared with the prior art, the invention has the beneficial effects that:
(1) the mining method of the homologous type 2 CRISPR/Cas gene editing system provided by the invention is simple and sensitive in process and broadly applicable;
(2) the construction method of the homologous type 2 CRISPR/Cas gene editing system provided by the invention is simple and easy to understand in design;
(3) the construction method of the homologous type 2 CRISPR/Cas gene editing system provided by the invention preliminarily identified 12 Cas9, 1 Cpf1 and 2 Cas13 proteins, wherein the 5 Cas9 proteins numbered C1556, C1793, C1807, C4640 and C6165 have different physicochemical properties and recognize different PAM sequences; the correspondingly recognized PAM sequences are W (A > T) TNTAH (A > T > C) NNAT, NATS (G > C) NY (C > T) GAT, NNTA, NNCGC and CGNGAGG respectively.
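The PAM sequences above are written with IUPAC degenerate base codes (for example W = A or T, S = C or G, Y = C or T, H = A, C or T, N = any base). The following Python sketch shows one way such a PAM string could be turned into a regular expression for matching candidate sites. It assumes that the parenthesized notes such as "(A > T)" indicate base preference and can be dropped for matching purposes; that reading, and the function names, are assumptions made for illustration.

```python
import re

# Standard IUPAC nucleotide ambiguity codes expressed as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def strip_preferences(pam):
    """Drop parenthesized notes and whitespace (assumed to be preference hints)."""
    without_notes = re.sub(r"\([^)]*\)", "", pam)
    return "".join(without_notes.split())


def pam_to_regex(pam):
    """Translate an IUPAC-coded PAM into a regular expression string."""
    return "".join(IUPAC[base] for base in pam.upper())


def matches_pam(pam, site):
    """True if the concrete DNA string `site` is compatible with the PAM."""
    return re.fullmatch(pam_to_regex(pam), site.upper()) is not None
```

For example, `strip_preferences("NATS (G > C) NY (C > T) GAT")` gives `"NATSNYGAT"`, which `matches_pam` can then test against a 9-nt flanking sequence.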
Drawings
FIG. 1 is a flowchart of the method provided by the present invention for mining homologous type 2 CRISPR/Cas systems from assembled metagenomic sequences.
FIG. 2 shows the homologous type 2 CRISPR/Cas9 gene editing system numbered C1556, wherein FIG. 2A is a composition diagram of the system; FIG. 2B is a structure diagram of the endonuclease Cas9 protein in the system; FIG. 2C is a CRISPR sequence diagram of the system; FIG. 2D is a prediction diagram of the RNA secondary structure of the guide RNA molecule recognized by the system; FIG. 2E is a schematic diagram of the conserved PAM sequence of the system.
FIG. 3 shows the homologous type 2 CRISPR/Cas9 gene editing system numbered C1793, wherein FIG. 3A is a composition diagram of the system; FIG. 3B is a structure diagram of the endonuclease Cas9 protein in the system; FIG. 3C is a CRISPR sequence diagram of the system; FIG. 3D is a prediction diagram of the RNA secondary structure of the guide RNA molecule recognized by the system; FIG. 3E is a schematic diagram of the conserved PAM sequence of the system.
FIG. 4 shows the homologous type 2 CRISPR/Cas9 gene editing system numbered C1807, wherein FIG. 4A is a composition diagram of the system; FIG. 4B is a structure diagram of the endonuclease Cas9 protein in the system; FIG. 4C is a CRISPR sequence diagram of the system; FIG. 4D is a prediction diagram of the RNA secondary structure of the guide RNA molecule recognized by the system; FIG. 4E is a schematic diagram of the conserved PAM sequence of the system.
FIG. 5 shows the homologous type 2 CRISPR/Cas9 gene editing system numbered C4640, wherein FIG. 5A is a composition diagram of the system; FIG. 5B is a structure diagram of the endonuclease Cas9 protein in the system; FIG. 5C is a CRISPR sequence diagram of the system; FIG. 5D is a prediction diagram of the RNA secondary structure of the guide RNA molecule recognized by the system; FIG. 5E is a schematic diagram of the conserved PAM sequence of the system.
FIG. 6 shows the homologous type 2 CRISPR/Cas9 gene editing system numbered C6165, wherein FIG. 6A is a composition diagram of the system; FIG. 6B is a structure diagram of the endonuclease Cas9 protein in the system; FIG. 6C is a CRISPR sequence diagram of the system; FIG. 6D is a prediction diagram of the RNA secondary structure of the guide RNA molecule recognized by the system; FIG. 6E is a schematic diagram of the conserved PAM sequence of the system.
Detailed Description
To better illustrate the objects, aspects and advantages of the present invention, the present invention will be further described with reference to specific examples.
Example 1
The embodiment is a method for constructing a homologous type 2 CRISPR/Cas gene editing system (a specific flow is shown in fig. 1), which includes the following steps:
(1) processing the metagenome sequencing data, and screening to obtain a contig sequence and a protein sequence;
(2) clustering the protein sequences in the step (1) by using OrthoFinder software, and then expanding the clusters by using the HMMER software package;
(3) searching for clusters related to the type 2 CRISPR/Cas system in the expanded clusters of step (2) using known hidden Markov model files of type 2 CRISPR/Cas system-related effector proteins;
(4) comparing the clusters related to the type 2 CRISPR/Cas system in the step (3) with the Swiss-Prot and NCBI nr databases, and screening to obtain genes;
(5) performing CRISPR/Cas system prediction, by using CRISPRCasFinder, on the contigs where the genes obtained by screening in the step (4) are located, wherein the contigs comprise the necessary auxiliary sequences and auxiliary proteins;
(6) extracting the predicted effector protein of the type 2 CRISPR/Cas system in the step (5), and analyzing the structural domains by using HHpred software;
(7) performing comparative genomics and evolutionary analysis on the structural domains of the effector protein of the type 2 CRISPR/Cas system predicted in the step (6) and on the auxiliary proteins related to that effector protein, to obtain the homologous type 2 CRISPR/Cas gene editing system.
Example 2
This embodiment is a construction method of a homologous type 2 CRISPR/Cas9 gene editing system and describes the specific operations of the following 4 steps: 1. processing the metagenomic sequencing data and screening to obtain contig sequences and protein sequences; 2. predicting CRISPR-Cas-related CRISPR sequences and Cas proteins on the obtained long sequences; 3. predicting the RNA secondary structure of the homologous type 2 CRISPR/Cas gene editing system; 4. predicting the PAM sequence that can be recognized by the homologous type 2 CRISPR/Cas gene editing system.
1. Preliminary processing of the metagenomic data to obtain long assembled sequences, potential coding genes and the corresponding protein sequences.
(1) Materials: metagenomic data of intestinal samples from IgA nephropathy (IgAN) patients
(2) Software: fastqc (v0.11.5), fastp (v0.19.8), SOAPnuke (v1.5.2), hisat2 (v2.0.4), Samtools (v1.9), GeneMark (v3.38), IDBA-UD (v1.1.3), Bowtie2 (v2.3.5.1), CD-HIT (v4.6), HMMER (3.1b2), BLAST (v2.3.0+).
(3) Method: the data source of this step is the intestinal metagenome; by performing quality control, assembly and gene prediction on the intestinal metagenomic data, all possible gene sequences and encoded protein sequences in the metagenome are obtained, together with the long genomic sequence on which each gene sequence is located;
The specific operations are as follows:
(a) first, quality control is performed with fastqc to check whether the data are raw or already cleaned; raw data are cleaned and adapters are removed with fastp, whereas cleaned data proceed directly to the next step;
(b) the metagenomic reads are aligned to the human reference genome (hg19) with hisat2 to produce a file in BAM (Binary Alignment/Map) format, and the reads that cannot be aligned to the human reference genome are extracted with Samtools to obtain a sequence file free of host (human) contamination;
(c) long sequences are then assembled and genes predicted: the cleaned reads are assembled with IDBA-UD to obtain long sequences;
(d) after the short assembled sequences are removed, genes are predicted with MetaGeneMark from the GeneMark software, the predicted genes are translated, and sequences of fewer than 600 amino acids are filtered out;
(e) for the search for Cas13, candidate proteins containing the HEPN domain motif (E … … rxxxh) are searched for among the protein sequences on the candidate sequences; for the search for Cas9 and Cpf1, all obtained protein sequences are clustered with CD-HIT and OrthoFinder, and each cluster is searched with HMM profiles built from known sequences related to the type 2 CRISPR-Cas system to obtain protein sequences possibly related to the CRISPR-Cas system;
(f) these are compared against mainstream databases (nr, Swiss-Prot) to obtain genes not yet annotated as CRISPR-Cas system related proteins; this step yields 12 Cas9 protein sequences (SEQ ID NO.1-SEQ ID NO.12), 1 Cpf1 protein sequence (SEQ ID NO.13) and 2 Cas13 protein sequences (SEQ ID NO.14-SEQ ID NO.15);
(g) the corresponding long sequence is retrieved by the tag in the gene name for use in the next step.
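The host-removal logic of operation (b) can be sketched in Python as follows. The sketch works on text-format SAM records rather than on a BAM file, and relies only on the SAM convention that bit 0x4 of the FLAG field marks an unmapped read; the exact Samtools command used is not given in the text, and the function name `unmapped_read_names` is hypothetical.

```python
UNMAPPED = 0x4  # SAM FLAG bit meaning "segment unmapped"


def unmapped_read_names(sam_lines):
    """Return the names of reads whose FLAG has the unmapped bit set.

    `sam_lines` is an iterable of text-format SAM lines; header lines
    (starting with '@') and blank lines are skipped.
    """
    names = []
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])  # column 2 of a SAM record is the FLAG
        if flag & UNMAPPED:
            names.append(fields[0])  # column 1 is the read name
    return names
```

Reads selected this way did not align to the human reference and would be the ones carried forward to assembly.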
2. Prediction of CRISPR-Cas-related CRISPR sequences and Cas proteins on the obtained long sequences
(1) Materials: the long sequences possibly related to the CRISPR-Cas system obtained in step 1 (preliminary processing of metagenomic data)
(2) Software: modified CRISPRCasFinder (v4.2.19), python (v3.7.4), bedtools (v2.25.0)
(3) Method: in this step, the obtained long sequences possibly related to the type 2 CRISPR-Cas system are analyzed to obtain the predicted CRISPR-Cas system related proteins and related elements.
The specific operations are as follows:
(a) the obtained long sequences possibly related to the type 2 CRISPR-Cas system are analyzed with CRISPRCasFinder to predict the proteins and elements related to the CRISPR-Cas system, such as Cas proteins, spacers and direct repeats;
(b) the candidate CRISPR-Cas system related proteins are annotated. For the Cas13 protein system: HMM files of the auxiliary proteins of Cas13a, Cas13b, Cas13c and Cas13d are collected, HMMER alignment is performed on the long sequences, and the presence or absence of the Cas1, Cas2, csx27, csx28 and WYL auxiliary proteins is determined. For the Cas9/Cpf1 protein system: HMM files of the Cas9/Cpf1 auxiliary proteins are collected, HMMER alignment is performed on the long sequences, and the presence or absence of the Cas1, Cas2, Cas4 and Csn2 auxiliary proteins is determined;
(c) for the Cas9 protein, the anti-repeat is searched for as the region complementary to the predicted repeat element, and the corresponding DNA sequence of the tracrRNA is confirmed; finally, the sequence range including all necessary elements is confirmed for later experimental synthesis.
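The anti-repeat search in operation (c) can be sketched as a scan for a region of the long sequence that is complementary to the predicted direct repeat. The Python sketch below slides the reverse complement of the repeat along the sequence and reports windows within a mismatch budget. The mismatch tolerance, the single-strand scan and the function names are assumptions made for illustration; natural anti-repeats are often only partially complementary, and the text does not state how the search was implemented.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA string (A/C/G/T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_anti_repeats(contig, repeat, max_mismatches=2):
    """Return (start, mismatches) for windows of `contig` that match the
    reverse complement of `repeat` with at most `max_mismatches` mismatches.
    """
    contig = contig.upper()
    probe = reverse_complement(repeat)
    width = len(probe)
    hits = []
    for start in range(len(contig) - width + 1):
        distance = hamming(contig[start:start + width], probe)
        if distance <= max_mismatches:
            hits.append((start, distance))
    return hits
```

A hit located outside the CRISPR array would then be a candidate anti-repeat from which the tracrRNA region could be delimited.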
3. Predicting RNA secondary structure
(1) Materials: predicted repeat or tracrRNA sequences
(2) Software: NUPACK (http://www.nupack.org/partition/new)
(3) Prediction method: for the Cas9 protein, secondary structure prediction is carried out by simulating the in vitro interaction at 37 ℃ of 1 μl each of the tracrRNA and the RNA transcribed from the repeat; for the Cas12/Cas13 proteins, secondary structure prediction is carried out on the obtained RNA by simulating an in vitro environment of 1 μl of CRISPR RNA at 37 ℃.
4. Prediction of PAM sequence of Cas9/Cas12
(1) Materials: spacer sequence possibly related to CRISPR-Cas system obtained in 3 (prediction of RNA secondary structure)
(2) Software: CRISPRCasFinder (v4.2.19), python (v3.7.4), bedtools (v2.25.0)
(3) Database: NCBI (Phage/Virus/Plasmid) database
The specific operations are as follows: the spacer sequences are aligned against the nucleic acid database of species related to the long sequence to obtain their upstream and downstream flanking sequences, and the PAM sequence corresponding to the CRISPR/Cas system is predicted from sequence conservation using WebLogo.
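The conservation-based PAM prediction can be illustrated with a small Python sketch that takes equal-length flanking sequences collected next to the spacer matches, computes the information content at each position, and reports a consensus. The 1-bit threshold and the function names are arbitrary choices made for illustration; WebLogo itself may apply corrections that are not modelled here.

```python
import math
from collections import Counter


def position_counts(flanks):
    """Count bases at each position of equal-length flanking sequences."""
    length = len(flanks[0])
    if any(len(f) != length for f in flanks):
        raise ValueError("all flanking sequences must have the same length")
    return [Counter(f[i].upper() for f in flanks) for i in range(length)]


def information_bits(counter):
    """Information content in bits for DNA: 2 minus the Shannon entropy."""
    total = sum(counter.values())
    entropy = -sum((n / total) * math.log2(n / total) for n in counter.values())
    return 2.0 - entropy


def consensus(flanks, min_bits=1.0):
    """Most frequent base at well-conserved positions, 'N' elsewhere."""
    out = []
    for counter in position_counts(flanks):
        base, _ = counter.most_common(1)[0]
        out.append(base if information_bits(counter) >= min_bits else "N")
    return "".join(out)
```

Positions where all flanking sequences agree carry 2 bits and appear as a fixed base in the consensus, whereas positions with no preference carry about 0 bits and are written as N.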
Example 3
This embodiment is the homologous type 2 CRISPR/Cas9 gene editing system numbered C1556, a homologous CRISPR/Cas9 system mined and predicted from the intestinal metagenomic data of patient 97A; the content of the system is shown in FIG. 2.
FIG. 2A is a composition diagram of the homologous type 2 CRISPR/Cas9 gene editing system numbered C1556, comprising a leader sequence at the Cas protein genomic locus (endonuclease Cas9, Cas1 protein, Cas2 protein), a CRISPR array and a trans-activating crRNA (tracrRNA);
FIG. 2B is a structure diagram of the endonuclease Cas9 protein in the homologous type 2 CRISPR/Cas9 gene editing system numbered C1556; the endonuclease Cas9 protein comprises 3 domains: Pox A22, BH and HNH4;
FIG. 2C is a CRISPR sequence diagram of the homologous type 2 CRISPR/Cas9 gene editing system numbered C1556; the CRISPR array consists of 36 bp repeats and 29/30 spacer sequences as basic units;
FIG. 2D is a prediction diagram of the RNA secondary structure of the guide RNA molecule recognized by the homologous type 2 CRISPR/Cas9 gene editing system numbered C1556; the sgRNA (crRNA-tracrRNA) structure comprises two bulge structures upstream and two stem loops downstream;
FIG. 2E is a schematic diagram of the conserved PAM sequence of the homologous type 2 CRISPR/Cas9 gene editing system numbered C1556; the PAM sequence that the system can recognize is W (A > T) TNTAH (A > T > C) NNAT.
Example 4
This example describes the homologous type 2 CRISPR/Cas9 gene editing system numbered C1793, a homologous CRISPR/Cas9 system predicted by mining the intestinal metagenomic data of patient 147A; the content of the system is shown in Fig. 3.
Fig. 3A is a composition diagram of the homologous type 2 CRISPR/Cas9 gene editing system numbered C1793, comprising a leader sequence in which the endonuclease Cas9, the CRISPR array and the trans-activating crRNA (tracrRNA) are located;
Fig. 3B is a structural diagram of the endonuclease Cas9 protein in the homologous type 2 CRISPR/Cas9 gene editing system numbered C1793; the protein comprises 6 domains: RuvX, BH, REC, HNH4, RuvC III and P1;
Fig. 3C is a diagram of the CRISPR array in the homologous type 2 CRISPR/Cas9 gene editing system numbered C1793; the CRISPR array consists of 36 bp repeats and 30 bp spacer sequences as basic units;
Fig. 3D is the predicted RNA secondary structure of the guide RNA molecule recognized by the homologous type 2 CRISPR/Cas9 gene editing system numbered C1793; the sgRNA (crRNA-tracrRNA) structure comprises three bulge structures upstream and three stem loops downstream;
Fig. 3E is a schematic diagram identifying the conserved PAM sequence of the homologous type 2 CRISPR/Cas9 gene editing system numbered C1793; the PAM sequence recognized by the system is NATS(G > C)NY(C > T)GAT.
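The CRISPR array described for Fig. 3C alternates a fixed repeat with spacer sequences. A minimal sketch of how such an array could be split into its basic units when the repeat is known is given below. It assumes the stated figures are lengths (a 36 nt repeat and 30 nt spacers); the repeat and spacer sequences are synthetic placeholders, not the sequences of system C1793, and `split_array` is a hypothetical helper name.

```python
def split_array(array: str, repeat: str) -> list[str]:
    """Return the spacers found between consecutive exact copies of the repeat."""
    parts = array.split(repeat)
    # Text before the first repeat and after the last repeat is flanking
    # sequence, so only the interior pieces are spacers.
    return parts[1:-1]

# Synthetic placeholder sequences with the lengths stated for C1793.
repeat = "GTTGTAGCTCCCTTTCTCATTTCGCAGTGCTACAAT"   # 36 nt
spacer_1 = "ACGTACGTAAGGCCTTAACCGGTTACGTAC"      # 30 nt
spacer_2 = "TTGACCATGGCATCAAGTCCGATAGGCTTA"      # 30 nt

# repeat-spacer-repeat-spacer-repeat, the layout of a two-spacer array.
array = repeat + spacer_1 + repeat + spacer_2 + repeat

spacers = split_array(array, repeat)
print(len(repeat), [len(s) for s in spacers])
```

Exact string matching is used only for brevity; repeats in a real array may carry point mutations, in which case an approximate matcher would be needed.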
Example 5
This example describes the homologous type 2 CRISPR/Cas9 gene editing system numbered C1807, a homologous CRISPR/Cas9 system mined from the intestinal metagenomic data of patient 1A; the content of the system is shown in Fig. 4.
Fig. 4A is a composition diagram of the homologous type 2 CRISPR/Cas9 gene editing system numbered C1807, comprising a leader sequence in which the endonuclease Cas9, the CRISPR array and the trans-activating crRNA (tracrRNA) are located;
Fig. 4B is a structural diagram of the endonuclease Cas9 protein in the homologous type 2 CRISPR/Cas9 gene editing system numbered C1807; the protein comprises 4 domains: RuvX, REC, HNH4 and RuvC III;
Fig. 4C is a diagram of the CRISPR array in the homologous type 2 CRISPR/Cas9 gene editing system numbered C1807; the CRISPR array consists of 47 bp repeats and 29-31 bp spacer sequences as basic units;
Fig. 4D is the predicted RNA secondary structure of the guide RNA molecule recognized by the homologous type 2 CRISPR/Cas9 gene editing system numbered C1807; the sgRNA (crRNA-tracrRNA) structure comprises five bulge structures upstream and two stem loops downstream;
Fig. 4E is a schematic diagram identifying the conserved PAM sequence of the homologous type 2 CRISPR/Cas9 gene editing system numbered C1807; the PAM sequence recognized by the system is NNNNNNTA.
Example 6
This example describes the homologous type 2 CRISPR/Cas9 gene editing system numbered C4640, a homologous CRISPR/Cas9 system predicted by mining the intestinal metagenomic data of patient 152A; the content of the system is shown in Fig. 5.
Fig. 5A is a composition diagram of the homologous type 2 CRISPR/Cas9 gene editing system numbered C4640, comprising a leader sequence at the Cas protein genomic locus (endonuclease Cas9 and Csn2 proteins), the CRISPR array and the trans-activating crRNA (tracrRNA);
Fig. 5B is a structural diagram of the endonuclease Cas9 protein in the homologous type 2 CRISPR/Cas9 gene editing system numbered C4640; the protein comprises 6 domains: RuvX, BH, REC, HNH4, RuvC III and P1;
Fig. 5C is a diagram of the CRISPR array in the homologous type 2 CRISPR/Cas9 gene editing system numbered C4640; the CRISPR array consists of 37 bp repeats and 28/29 bp spacer sequences as basic units;
Fig. 5D is the predicted RNA secondary structure of the guide RNA molecule recognized by the homologous type 2 CRISPR/Cas9 gene editing system numbered C4640; the sgRNA (crRNA-tracrRNA) structure comprises two bulge structures upstream and three stem loops downstream;
Fig. 5E is a schematic diagram identifying the conserved PAM sequence of the homologous type 2 CRISPR/Cas9 gene editing system numbered C4640; the PAM sequence recognized by the system is NNCGC.
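To illustrate how a recognized PAM such as NNCGC constrains target selection, the sketch below scans both strands of a DNA sequence for candidate sites. It rests on two assumptions that the text above does not state: that the PAM lies immediately 3' of the protospacer on the scanned strand, as is typical for Cas9, and that a 28 nt protospacer (one of the spacer lengths given for C4640) is used. The function names and the demonstration sequence are hypothetical.

```python
import re

# IUPAC degenerate nucleotide codes mapped to the bases they allow.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def find_targets(seq: str, pam: str, spacer_len: int):
    """List (strand, start, protospacer, pam_hit) for every protospacer
    that is directly followed by a match to the IUPAC PAM."""
    pam_regex = "".join(f"[{IUPAC[b]}]" for b in pam.upper())
    # A lookahead is used so that overlapping sites are all reported.
    pattern = re.compile(rf"(?=([ACGT]{{{spacer_len}}})({pam_regex}))")
    hits = []
    forward = seq.upper()
    for strand, s in (("+", forward), ("-", reverse_complement(forward))):
        for m in pattern.finditer(s):
            hits.append((strand, m.start(), m.group(1), m.group(2)))
    return hits

# Synthetic demonstration sequence: a 28 nt protospacer followed by TTCGC.
demo = "GATTACA" * 4 + "TTCGC"
for hit in find_targets(demo, "NNCGC", 28):
    print(hit)
```

Start coordinates on the "-" strand are given relative to the reverse-complemented sequence; converting them back to forward-strand coordinates is left out of the sketch.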
Example 7
This example describes the homologous type 2 CRISPR/Cas9 gene editing system numbered C6165, a homologous CRISPR/Cas9 system mined from the intestinal metagenomic data of patient 135A; the content of the system is shown in Fig. 6.
Fig. 6A is a composition diagram of the homologous type 2 CRISPR/Cas9 gene editing system numbered C6165, comprising a leader sequence at the Cas protein genomic locus (endonuclease Cas9, Cas1 protein, Cas2 protein), the CRISPR array and the trans-activating crRNA (tracrRNA);
Fig. 6B is a structural diagram of the endonuclease Cas9 protein in the homologous type 2 CRISPR/Cas9 gene editing system numbered C6165; the protein comprises 4 domains, including REC, HNH4 and RuvC III;
Fig. 6C is a diagram of the CRISPR array in the homologous type 2 CRISPR/Cas9 gene editing system numbered C6165; the CRISPR array consists of 47 bp repeats and 29/30 bp spacer sequences as basic units;
Fig. 6D is the predicted RNA secondary structure of the guide RNA molecule recognized by the homologous type 2 CRISPR/Cas9 gene editing system numbered C6165; the sgRNA (crRNA-tracrRNA) structure comprises four bulge structures upstream and three stem loops downstream;
Fig. 6E is a schematic diagram identifying the conserved PAM sequence of the homologous type 2 CRISPR/Cas9 gene editing system numbered C6165; the PAM sequence recognized by the system is CGNGAGG.
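For reference, the parameters stated in Examples 3 to 7 and in the sequence listing can be gathered into a small data structure, as sketched below. Several readings are assumptions: the repeat and spacer figures are taken as lengths in bp, the PAM strings for C1556 and C1793 are written without their preference annotations, and the protein lengths are the `<211>` values of SEQ ID NO: 1 to 5. The class and field names are illustrative only.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Cas9System:
    number: str                 # system number used in the examples
    repeat_bp: int              # CRISPR repeat length (assumed bp)
    spacer_bp: tuple[int, ...]  # spacer lengths reported (assumed bp)
    pam: str                    # PAM in IUPAC codes, preferences omitted
    protein_aa: int             # Cas9 length from the <211> field

SYSTEMS = [
    Cas9System("C1556", 36, (29, 30), "WTNTAHNNAT", 1144),
    Cas9System("C1793", 36, (30,), "NATSNYGAT", 1358),
    Cas9System("C1807", 47, (29, 30, 31), "NNNNNNTA", 1426),
    Cas9System("C4640", 37, (28, 29), "NNCGC", 1314),
    Cas9System("C6165", 47, (29, 30), "CGNGAGG", 1151),
]

def unit_lengths(system: Cas9System) -> list[int]:
    """Length of one repeat-plus-spacer unit for each reported spacer length."""
    return [system.repeat_bp + s for s in system.spacer_bp]

for s in SYSTEMS:
    print(s.number, s.pam, s.protein_aa, unit_lengths(s))
```

Keeping the values in one place makes it easy to compare the systems, for example by PAM length or by Cas9 protein size.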
Finally, it should be noted that the above embodiments only illustrate the technical solutions of the present invention and do not limit its scope of protection. Although the present invention has been described in detail with reference to the preferred embodiments, those skilled in the art should understand that modifications or equivalent substitutions may be made to the technical solutions of the present invention without departing from their spirit and scope.
SEQUENCE LISTING
<110> Sun Yat-sen University
<120> construction method of homologous type 2 CRISPR/Cas gene editing system
<130> 2020.11.18
<160> 15
<170> PatentIn version 3.5
<210> 1
<211> 1144
<212> PRT
<213> C1556 protein sequence
<400> 1
Met Glu Ser Lys Tyr Asn Tyr Arg Ile Gly Leu Asp Ile Gly Ile Ala
1 5 10 15
Ser Val Gly Trp Ala Val Leu Glu Asn Asn Ser Lys Asp Glu Pro Val
20 25 30
His Ile Met Asp Leu Gly Val Arg Ile Phe Asp Thr Ala Glu Asp Ser
35 40 45
Gln Thr Gly Asp Ser Leu Ala Ala Pro Arg Arg Asn Ala Arg Ser Met
50 55 60
Arg Arg Arg Leu Arg Arg Arg Arg His Arg Leu Glu Arg Ile Lys Gln
65 70 75 80
Leu Leu Glu Arg Glu Gly Leu Ile Gln Thr Glu Gln Phe Met Lys Arg
85 90 95
Tyr Glu Ser Lys Asp Leu Pro Asp Val Tyr Gln Leu Arg Tyr Glu Ala
100 105 110
Leu Asn Arg Arg Leu Thr Asp Asp Glu Leu Ala Gln Val Leu Ile His
115 120 125
Ile Ala Lys His Arg Gly Phe Lys Ser Asn Arg Lys Ala Glu Leu Lys
130 135 140
Glu Asp Ala Glu Ala Gly Lys Val Leu Thr Ala Thr Lys Glu Asn Arg
145 150 155 160
Glu Arg Leu Gln Ser Gly Asn Tyr Arg Thr Ile Gly Glu Met Ile Tyr
165 170 175
Cys Asp Thr Ala Phe Gln Thr Glu Cys Asp Trp Val Glu Lys Gly Tyr
180 185 190
Ile Leu Thr Pro Arg Asn Lys Ala Asp Ser Tyr Lys His Thr Ile Glu
195 200 205
Arg Ala Leu Leu Val Glu Glu Val Lys Lys Ile Phe Glu Ala Gln Arg
210 215 220
Thr Phe Gly Asn Glu Arg Ala Thr Glu Lys Leu Glu Glu Asn Tyr Leu
225 230 235 240
Ser Ile Met Glu Ser Gln Arg Ser Phe Asp Met Gly Pro Gly Leu Gln
245 250 255
Ala Asp Gly Ser Glu Ser Pro Phe Ala Leu Asn Gly Phe Glu Asp Lys
260 265 270
Val Gly Tyr Cys Thr Leu Glu Gly Lys Ser Glu Lys Arg Gly Ala Lys
275 280 285
Ala Thr Tyr Thr Ala Glu Leu Phe Val Val Ser Gln Lys Leu Ser His
290 295 300
Leu Lys Leu Leu Lys Arg Gly Gly Glu Gly Arg Phe Leu Thr Lys Glu
305 310 315 320
Glu Lys Ala Thr Val Leu Ala Leu Leu His Thr Gln Lys Glu Val Lys
325 330 335
Tyr Ser Ala Ile Arg Lys Lys Leu Asn Ile Pro Ser Glu Tyr His Phe
340 345 350
Asn Thr Leu Thr Tyr Thr Ser Lys Lys Lys Asp Leu Ser Ala Glu Glu
355 360 365
Arg Glu Lys Glu Val Glu Lys Ala Thr Phe Gly Lys Leu Glu Asn Tyr
370 375 380
His Lys Met Met Lys Cys Leu Asn Asp Glu Thr Lys Gln Arg Pro Ala
385 390 395 400
Glu Glu Leu Gln Glu Leu Leu Asp Ser Ile Ala Thr Thr Leu Thr Leu
405 410 415
Tyr Lys Ser Asp Asp Lys Arg Arg Glu Cys Leu Lys Glu Leu Pro Leu
420 425 430
Ile Asp Glu Glu Ile Glu Asn Leu Leu Glu Leu Thr Phe Ala Lys Phe
435 440 445
Met Asn Leu Ser Val Lys Ala Met Arg Lys Leu Ile Pro Tyr Leu Gln
450 455 460
Gly Glu Glu Ser Met Thr Tyr Asp Lys Ala Cys Ala Val Ala Gly Tyr
465 470 475 480
Asp Phe Lys Ala Glu Gly Gly Glu Tyr Lys Ser Lys Phe Leu Lys Gly
485 490 495
Glu Arg Val Thr Glu Ile Ile Asn Asp Ile Pro Asn Pro Val Val Lys
500 505 510
Arg Ser Val Ser Gln Thr Val Lys Val Ile Asn Ala Ile Ile Gln Lys
515 520 525
Tyr Gly Ser Pro Gln Ala Val Tyr Ile Glu Leu Ala Arg Glu Met Ala
530 535 540
Lys Asn Phe Lys Asp Arg Lys Asp Ile Glu Arg Lys Asn Lys Ala Arg
545 550 555 560
Glu Ala Asp Asn Glu Arg Ile Lys Lys Gln Ile Gln Glu Tyr Gly Ile
565 570 575
Ser Ser Pro Thr Gly Gln Asp Ile Ile Lys Phe Arg Leu Trp Gln Glu
580 585 590
Gln Asp Gly Ile Cys Met Tyr Ser Gly Glu Arg Met Ser Ile Glu Asn
595 600 605
Leu Phe Gly Lys Asn Ser Ser Cys Asp Ile Asp His Ile Leu Pro Tyr
610 615 620
Ser Gln Thr Phe Asp Asp Ser Tyr His Asn Lys Val Leu Val Leu Ser
625 630 635 640
Arg Glu Asn Arg Gln Lys Gly Asn Gln Thr Pro Tyr Glu Tyr Leu Gly
645 650 655
Gln Asp Val Lys Arg Trp Asn Glu Phe Val Ala Arg Val Ser Ala Leu
660 665 670
Glu Lys Asp Gly Lys Lys Arg Glu His Phe Phe Lys Glu His Ile Thr
675 680 685
Glu Glu Asp Lys Lys Gln Phe Lys Glu Arg Asn Leu Asn Asp Thr Lys
690 695 700
Tyr Ile Thr Thr Ile Val Tyr Asn Leu Ile Arg Gln His Leu Glu Leu
705 710 715 720
Ala Pro Tyr Asn Val Pro Gly Lys Lys Lys Gln Val Lys Ala Val Asn
725 730 735
Gly Val Ile Thr Ser Tyr Leu Arg Lys Arg Trp Gly Leu Pro His Lys
740 745 750
Asp Arg Ser Thr Asp Thr His His Ala Met Asp Ala Val Thr Ile Ala
755 760 765
Cys Cys Thr Glu Gly Met Ile Asn Lys Ile Ser Arg Ser Met Gln Ile
770 775 780
Leu Glu Leu Arg Tyr Met Arg Asn Gly Arg Met Val Asp Glu Glu Thr
785 790 795 800
Gly Glu Ile Tyr Asp Arg Glu Asn Tyr Thr Leu Gln Glu Trp Lys Asp
805 810 815
Lys Phe Gly Val His Ile Pro Arg Pro Trp Glu Thr Phe Lys Asp Glu
820 825 830
Leu Asp Val Arg Met Gly Glu Asp Pro Leu Asn Phe Leu Asp Thr His
835 840 845
Pro Asp Val Ala Lys Glu Leu Asp Tyr Pro Glu Tyr Tyr Tyr Pro Asp
850 855 860
Glu Lys Ser Asn Trp Lys Gly Phe Ile Arg Pro Ile Phe Val Ser Arg
865 870 875 880
Met Pro Asn His Lys Val Thr Gly Gln Ala Asn Lys Asp Thr Val Arg
885 890 895
Ser Pro Lys Leu Phe Glu Glu Gly Tyr Val Ile Ser Lys Val Pro Leu
900 905 910
Thr Ser Leu Lys Leu Lys Asp Gly Glu Ile Ala Asn Tyr Tyr Asn Lys
915 920 925
Glu Ser Asp Met Leu Leu Tyr Asn Ala Leu Lys Arg Gln Leu Glu Leu
930 935 940
Tyr Ser Asn Asp Ala Ala Lys Ala Phe Val Gln Pro Phe His Lys Pro
945 950 955 960
Lys Ala Asp Gly Asn Gln Gly Pro Val Val Lys Lys Val Lys Val Val
965 970 975
Asp Lys Gln Ser Ser Gly Val Tyr Val His Glu Gly Lys Gly Ile Thr
980 985 990
Ala Asn Gly Asp Met Ile Arg Ile Asp Ile Phe Arg Glu Asn Glu Lys
995 1000 1005
Tyr Tyr Phe Val Pro Ile Tyr Ala Ala Asp Val Val Lys Lys Val
1010 1015 1020
Leu Pro Asn Lys Ala Ala Thr Ala His Lys Asn Tyr Asp Glu Trp
1025 1030 1035
Arg Glu Met Lys Asp Glu Asn Phe Val Phe Ser Leu Tyr Pro Lys
1040 1045 1050
Asp Leu Ile His Val Lys Thr Lys Lys Glu Asp Gly Leu Lys Leu
1055 1060 1065
Lys Met Ala Asn Gly Gly Met Val Lys Arg Gln Glu Glu Tyr Val
1070 1075 1080
Tyr Phe Ser Val Ala Asn Ile Ser Thr Ala Ser Ile Ala Gly Phe
1085 1090 1095
Ala Asn Asp Lys Ser Phe Ser Phe Glu Gly Leu Gly Ile Gln Ser
1100 1105 1110
Leu Thr Ile Phe Glu Lys Cys Gln Val Asp Val Leu Gly Asn Ile
1115 1120 1125
Ser Val Val Lys His Glu Lys Arg Met Asp Phe Ser Ala Lys Lys
1130 1135 1140
His
<210> 2
<211> 1358
<212> PRT
<213> C1793 protein sequence
<400> 2
Met Glu Lys Leu Gln Lys Tyr Phe Leu Gly Leu Asp Ile Gly Thr Asn
1 5 10 15
Ser Val Gly Tyr Ala Ala Thr Gly Lys Asp Tyr Asp Leu Leu Lys Phe
20 25 30
Arg Gly Glu Pro Val Trp Gly Val Thr Thr Phe Glu Glu Ala Ser Leu
35 40 45
Ala Glu Glu Arg Arg Ile Asn Arg Ala Ser Arg Arg Gln Leu Asp Arg
50 55 60
Cys Gln Gln Arg Ile Ser Met Leu Gln Glu Ile Phe Ala Pro Tyr Ile
65 70 75 80
Cys Gln Thr Asp Pro Asn Phe Phe Val Arg Arg Ala Glu Ser Ala Leu
85 90 95
Phe Ala Glu Asp Ser Gln Gln Gly Val Arg Ile Phe Asp Gly Gly Ile
100 105 110
Asp Asp Lys Glu Tyr His Arg Lys Tyr Pro Thr Ile His His Leu Ile
115 120 125
Val Glu Leu Met Ala Ala Asp Ser Pro Arg Asp Ile Arg Leu Val Tyr
130 135 140
Leu Ala Cys Ala Trp Leu Val Ala Asn Arg Gly His Phe Leu Ser Glu
145 150 155 160
Ala Lys Ala Asp Ser Val Val Asp Phe Gln Lys Pro Tyr Thr Glu Phe
165 170 175
Leu Ser Ser Phe Thr Trp Asn Tyr Gln Cys Gln Pro Pro Trp Ala Ala
180 185 190
Ser Val Ser Cys Glu Thr Ile Gln Glu Ile Met Gln Ala Gln Val Gly
195 200 205
Ile Thr Gln Arg Lys Ala Lys Phe Lys Ala Glu Val Tyr Gly Gly Lys
210 215 220
Ser Pro Ser Lys Lys Pro Glu Glu Glu Phe Pro Phe Ser Arg Asp Ala
225 230 235 240
Ile Val Ser Leu Leu Ser Gly Gly Lys Val Ser Pro Ala Glu Leu Phe
245 250 255
Gly Asn Glu Ala Tyr Lys Gly Leu Asp Ser Val Ala Leu Ser Met Asp
260 265 270
Glu Glu Asn Phe Glu Arg Ile Leu Ser Glu Leu Gly Asp Asp Ala Glu
275 280 285
Leu Leu Arg Asn Met Gln Ala Met Tyr Asp Cys Ala Leu Leu Asn Ile
290 295 300
Thr Leu Gln Gly Lys Arg Ser Ile Ser Glu Ala Lys Val Ala Val Tyr
305 310 315 320
Glu Gln His Lys Glu Asp Leu Gly Phe Leu Lys Tyr Phe Ile Arg Lys
325 330 335
Tyr Cys Arg Gln Asn Tyr Ser Lys Val Phe Arg Ser Ala Ala Lys Asp
340 345 350
Asn Tyr Thr Ala Tyr Ser Gly Asn Val Lys Ser Cys Lys Glu Ala Asn
355 360 365
Lys Val Ala Arg Thr Asn Lys Asn Ala Phe Cys Asp Phe Val Lys Lys
370 375 380
Leu Val Lys Asp Ile Arg Pro Glu Lys Glu Asp Leu Ala Lys Tyr Glu
385 390 395 400
Asn Met Met Glu Arg Leu Ser Leu Tyr Thr Phe Leu Pro Lys Gln Lys
405 410 415
Asp Ser Asp Asn Arg Val Ile Pro His Gln Leu Tyr Glu Val Glu Leu
420 425 430
Asp Gln Ile Leu Ala Asn Ala Ser Met His Leu Thr Met Leu Gly Asn
435 440 445
Ala Asp Ala Asn Gly Ile Thr Asn Ala Asp Lys Ile Arg Ala Ile Phe
450 455 460
Arg Phe Arg Val Pro Tyr Tyr Val Gly Pro Leu Asn Gln Lys Ser Pro
465 470 475 480
Tyr Ala Trp Leu Glu Arg Lys Pro Glu Lys Ile Tyr Pro Trp Asn Phe
485 490 495
Glu Lys Ile Val Asp Leu Asp Lys Ser Glu Gln Asn Phe Ile Arg Arg
500 505 510
Met Thr Asn Thr Cys Thr Tyr Leu Pro Gly Glu Ala Val Leu Pro Tyr
515 520 525
Cys Ser Leu Leu Tyr Ser Arg Tyr Met Val Leu Asn Glu Ile Asn Asn
530 535 540
Leu Lys Ile Asn Gln His Ala Ile Ser Ile Pro Leu Lys Gln Lys Ile
545 550 555 560
Tyr Gln Asp Leu Phe Glu Asn Ser Gly Lys Lys Val Thr Lys Lys Ala
565 570 575
Ile Ala Gly Tyr Leu Arg Ser Gln Gly Leu Met Thr Ser Ser Asp Glu
580 585 590
Ile Ser Gly Val Asp Asp Thr Leu Lys Ala Asn Leu Lys Pro Tyr His
595 600 605
Ser Phe Arg His Met Leu Ser Ala Gly Thr Leu Ser Glu Glu Gln Val
610 615 620
Glu Asp Ile Ile Leu His Ala Ala Tyr Ser Glu Asp Lys Ser Arg Met
625 630 635 640
Gly Arg Trp Leu Glu Ser His Tyr Pro Val Leu Ser Glu Gln Asn Arg
645 650 655
Lys Tyr Ile Leu Arg Leu Asn Leu Lys Gly Phe Gly Arg Leu Ser Gly
660 665 670
Arg Phe Leu Thr Gly Ile Val Cys Thr Gln Gly Asn Asp Arg Gly Glu
675 680 685
Ala Met Ser Ile Ile Asp Ala Leu Trp Gln Thr Asn Asn Asn Leu Met
690 695 700
Gln Leu Leu Ser Ser Ser Tyr Thr Phe Gln Asn Gln Ile Cys Glu Phe
705 710 715 720
Ala Ala Asp Tyr Tyr Ser Glu Pro Ser His Lys Lys Thr Leu Ser Gln
725 730 735
Arg Leu Asp Asn Met Tyr Val Ser Asn Thr Val Lys Arg Gln Ile Ile
740 745 750
Arg Ser Leu Asp Val Cys Ser Asp Val Val Lys Ala Met Lys Asn Ala
755 760 765
Pro Glu Arg Ile Phe Val Glu Met Ala Arg Gly Thr Val Glu Asp Gln
770 775 780
Lys Gly Lys Arg Thr Lys Ser Arg Lys Gln Gln Leu Leu Asp Leu Tyr
785 790 795 800
Asn Gln Val Arg Cys Glu Asp Ala Pro Glu Leu Leu Ala Asn Leu Glu
805 810 815
Ala Met Gly Asp Glu Ala Asp Asn Arg Leu Gln Ser Asp Lys Leu Phe
820 825 830
Leu Tyr Tyr Leu Gln Leu Gly Lys Cys Ala Tyr Thr Gly Gln Pro Ile
835 840 845
Glu Leu Glu Gln Leu Ala Ser Lys Ala Tyr Asp Ile Asp His Ile Tyr
850 855 860
Pro Gln Ser Lys Val Gln Asp Asp Ser Ile Leu Asn Asn Lys Val Leu
865 870 875 880
Cys Leu Ser Thr Ala Asn Gly Glu Lys Gly Asp Leu Phe Pro Ile Arg
885 890 895
Leu Glu Ile Gln Lys Lys Met Gln Pro Phe Trp Ser Tyr Leu Lys Asn
900 905 910
Ala Gly Leu Met Asn Glu Glu Lys Tyr Ser Arg Leu Thr Arg Thr Phe
915 920 925
Leu Phe Thr Ala Asp Glu Leu His Gln Phe Ile Asn Arg Gln Leu Val
930 935 940
Glu Thr Arg Gln Ser Thr Lys Val Val Thr Gln Leu Leu Gln Glu Arg
945 950 955 960
Tyr Pro Glu Thr Glu Ile Val Tyr Val Lys Ala Cys Leu Val Ser Arg
965 970 975
Phe Arg Gln Glu Phe Gly Met Leu Lys Cys Arg Ser Val Asn Asp Leu
980 985 990
His His Ala Lys Asp Ala Tyr Leu Asn Ile Val Val Gly Asn Val Tyr
995 1000 1005
His Glu Lys Phe Thr Arg Ile Asn Ile Glu Glu Ser Tyr Ser Leu
1010 1015 1020
Asn Leu Lys Pro Leu Phe Glu Asn Ser Asn Leu Thr Val Trp Lys
1025 1030 1035
Gly Lys Gly Ser Leu Ala Lys Val Arg Lys Val Ile Gly Lys Asn
1040 1045 1050
Ala Val His Val Thr Arg Tyr Ala Phe Cys Arg Lys Gly Gly Leu
1055 1060 1065
Phe Asp Gln Gln Pro Lys Lys Ala Gln Glu Gly Leu Val Pro Leu
1070 1075 1080
Lys Lys Ser Leu Pro His Asp Leu Lys Arg Glu Leu Pro Thr Glu
1085 1090 1095
Lys Tyr Gly Gly Tyr Asn Lys Pro Thr Ala Ser Phe Tyr Leu Leu
1100 1105 1110
Ser Ala Tyr Ser Val Gly Asn Lys Lys Asp Ile Met Phe Val Pro
1115 1120 1125
Val Glu Leu Arg Tyr Ala Ala Arg Val Leu Ser Asp Thr Glu Phe
1130 1135 1140
Ala Ile Trp Tyr Thr Phe Arg Glu Ile Ser Lys Ile Asn Gly Asn
1145 1150 1155
Lys Pro Val Ser Asp Val Arg Ile Leu Leu Asn Gly Arg Pro Leu
1160 1165 1170
Lys Ile Asn Thr Val Leu Leu Leu Asp Gly Met Pro Val Thr Ile
1175 1180 1185
Arg Gly Lys Thr Asn Lys Gly Ala Gln Ile Ile Val Ala Ser Gln
1190 1195 1200
Leu Pro Leu Ile Leu Pro Asp Gln Ala Glu Gln Tyr Ile Lys Arg
1205 1210 1215
Leu Glu Ser Phe Ser Thr Lys Lys Lys Ala Asn Asp Lys Leu Arg
1220 1225 1230
Leu Asn Glu Lys Tyr Asp His Ile Ser Arg Glu Gln Asn Val Glu
1235 1240 1245
Leu Tyr Thr Leu Leu Ser Gln Lys Leu Lys His Ser Ile Phe Ala
1250 1255 1260
Lys Cys Pro Gly Ser Ile Ala Arg Thr Val Glu Asp Gly Tyr Glu
1265 1270 1275
Lys Phe Ser Gln Leu Asp Pro Glu Asp Gln Ile Gly Cys Leu Met
1280 1285 1290
Gly Ile Val Ser Trp Phe Gly Ser Gln Ser Asn Gly Ile Asp Leu
1295 1300 1305
Thr Leu Ile Gly Gly Lys Ala Ala Thr Gly Thr Lys Leu Ile Asn
1310 1315 1320
Ala Lys Phe Ser Asn Leu Ala Lys Asn Ile Arg Asp Leu Arg Ile
1325 1330 1335
Val Asp Thr Ser Ala Ser Gly Leu Tyr Val Ser Arg Ser Glu Asn
1340 1345 1350
Leu Leu Gly Leu Leu
1355
<210> 3
<211> 1426
<212> PRT
<213> C1807 protein sequence
<400> 3
Met Lys Val Ala Gly Phe Asp Leu Gly Thr Asn Ser Ile Gly Ile Ala
1 5 10 15
Val Arg Asp Thr Glu Lys Ser Asp Glu Leu Thr Glu Gln Leu Asp Tyr
20 25 30
Phe Ser Val Val Thr Phe Pro Cys Gly Val Gly Thr Gly Lys Thr Gly
35 40 45
Glu Phe Ser Tyr Ala Ala Glu Arg Thr Gln Lys Arg Ser Gln Arg Lys
50 55 60
Leu Tyr Lys Val Arg Arg Tyr Arg Lys Trp Ser Thr Leu Ala Leu Leu
65 70 75 80
Ile Gln Tyr Asp Phe Cys Pro Leu Ser Leu Glu Glu Leu Asp Arg Trp
85 90 95
Arg Val Tyr Asp Lys Glu Arg Gly Leu Lys Arg Gln Tyr Pro Val Glu
100 105 110
Ser Glu Thr Phe Ala Arg Trp Ile Arg Leu Asp Phe Asp Gly Asp Gly
115 120 125
Met Pro Asp Cys Thr Pro Tyr Gln Leu Arg Lys Ser Leu Val Glu Thr
130 135 140
Lys Leu Asp Leu Ser Leu Pro Ser Asp Arg Tyr Lys Val Gly Arg Ala
145 150 155 160
Leu Tyr His Ile Ala Gln Arg Arg Gly Phe Lys Ser Ser Arg Gly Glu
165 170 175
Thr Ala Lys Glu Gln Glu Ser Ala Glu Gln Ala Asp Asp Ser Asn Glu
180 185 190
Ala Val Ser Leu Gln Lys Ser Glu Glu Lys Lys Ser Lys Leu Leu Thr
195 200 205
Asp Phe Met Lys Ala His Asp Cys Pro Thr Val Gly Tyr Ala Leu Ala
210 215 220
Leu Leu Ile Glu Gln Gly Ile Arg Val Arg Gly Ser Glu Tyr Gln Val
225 230 235 240
Val Arg Ser Gln Tyr Met Glu Glu Val Glu Thr Ile Phe Arg Phe Gln
245 250 255
Gly Met Tyr Glu Arg Phe Pro Glu Phe Cys Gln Gly Ile Leu Ser Arg
260 265 270
Lys Lys Gly Glu Gly Thr Ile Phe Tyr Lys Arg Pro Leu Arg Ser Gln
275 280 285
Arg Gly Leu Val Gly Lys Cys Thr Leu Glu Pro Asn Lys Pro Arg Cys
290 295 300
Pro Ile Ala His Pro Asp Phe Glu Glu Phe Arg Ala Leu Val Phe Leu
305 310 315 320
Asn Asn Ile Lys Tyr Arg Gln Ser Ala Glu Asp Pro Trp Leu Thr Leu
325 330 335
Asp Ser Asp Met Lys Glu Glu Leu Met Ala Ser Lys Phe Tyr Arg Cys
340 345 350
Gln Thr Thr Phe Pro Phe Lys Glu Ile Arg Glu Trp Leu Glu Lys Lys
355 360 365
Thr Gly Trp Ser Leu Ser Lys Lys Gly Lys Thr Val Asn Tyr Ser Asp
370 375 380
Arg Asp Ser Val Ser Ala Ser Pro Val Thr Ala Arg Leu Arg His Leu
385 390 395 400
Leu Gly Asp Asp Trp Lys Asn Trp Thr Phe Val Ser Glu Phe Glu Lys
405 410 415
Cys Gly His Asp Lys Val Ala Arg Lys Ala Val Tyr Thr Ala Tyr Asp
420 425 430
Ile Trp His Val Cys Tyr Asp Cys Asp Asp Glu Asp Phe Leu Cys Asp
435 440 445
Phe Ala Ala Asn Lys Leu Ser Phe Asp Ser Gln Gln Thr Lys Lys Leu
450 455 460
Val Arg Leu Tyr Glu Glu Met Arg Gln Gly Tyr Ala Met Leu Ser Leu
465 470 475 480
Lys Ala Ile Arg Asn Ile Leu Pro Phe Leu Arg Lys Gly Tyr Ile Tyr
485 490 495
Ser Thr Ala Val Met Leu Ala Lys Ile Pro Glu Ile Val Gly Arg Glu
500 505 510
Lys Trp Asn Glu Tyr Glu Lys Asp Phe Val Ala Gln Val His Leu Leu
515 520 525
Gln Gln Arg Val Ser Asp Glu Arg Leu Val Leu Asn Ile Val Asn Thr
530 535 540
Leu Ile Ala Asn Tyr Lys Ser Asn Asp Tyr Glu Glu Arg Trp Ala Glu
545 550 555 560
His Asn Tyr Asp Tyr Val Leu Asp Asp Ser Asp Arg Ala Asp Val Leu
565 570 575
Lys Ala Ala Val Gly Cys Ile Gly Gln Lys Thr Trp Glu Ser Lys Pro
580 585 590
Gln Ala Glu Arg Asp Glu Leu Leu Lys Lys Val Glu Glu Arg Tyr Gln
595 600 605
Ala Phe Phe Ala Asp His Lys Arg Lys Tyr Tyr Glu Met Pro Lys Met
610 615 620
Lys Asp Val Leu Lys Ala Glu Leu Val Gly Val Phe Ser Asp Ile Glu
625 630 635 640
Ala Lys Ala Phe Asp Arg Leu Tyr His Pro Ser Asp Phe Glu Val Tyr
645 650 655
Pro His Ser Ala Thr Gly Leu Leu Gly Ser Pro Val Thr Gly Ala Leu
660 665 670
Lys Asn Pro Met Ala Met Arg Val Leu Tyr Ser Leu Arg Arg Glu Val
675 680 685
Asn Ala Leu Leu Lys Lys Gly Ile Ile Asp Ala Asp Thr Arg Ile Val
690 695 700
Ile Glu Thr Ala Arg Glu Leu Asn Asp Ala Asn Met Arg Ala Ala Val
705 710 715 720
Ala Asp Tyr Gln Lys Lys Arg Glu Lys Glu Asn Gln Lys Ile Arg Glu
725 730 735
Ile Leu Val Glu Leu Leu Gly Lys Arg Asp Tyr Ser Asp Thr Glu Val
740 745 750
Asp Lys Thr Arg Phe Leu Leu Glu Gln His Asp Ile Leu Asp Val Ala
755 760 765
Ala Met Pro Gln Ala Glu Lys Lys Thr Lys Lys Asn Thr Glu Lys Gln
770 775 780
Lys Ala Glu Lys Phe Asp Arg Asp Thr Thr Lys Tyr Arg Leu Trp Leu
785 790 795 800
Glu Gln Gly Cys Arg Cys Ile Tyr Thr Gly Lys Val Ile Asn Ile Thr
805 810 815
Ser Leu Phe Asp Asp Asn Leu Tyr Glu Ile Glu His Thr Ile Pro Arg
820 825 830
Ser Leu Ser Phe Asp Asn Ser Gln Ala Asn Leu Thr Val Cys Asp Ala
835 840 845
His Phe Asn Arg His Val Lys Lys Asn Arg Ile Pro Thr Gln Leu Asp
850 855 860
Asn Tyr Asp Glu Ile Tyr Met Arg Ile Thr Pro Trp Val Glu Lys Val
865 870 875 880
Glu Gln Leu Ala Asp Arg Val Lys Ala Trp Thr Asp Ala Ala Lys Arg
885 890 895
Ala Gln Asp Lys Asp Arg Lys Asp Tyr Cys Ile Arg Gln Arg His Gln
900 905 910
Trp Gln Met Glu Leu Asp Tyr Trp Arg Asp Lys Val Asn Arg Phe Thr
915 920 925
Met Thr Glu Val Thr Asp Gly Phe Arg Asn Ser Gln Leu Val Asp Thr
930 935 940
Arg Ile Ile Thr Lys Tyr Ala Tyr His Phe Leu Lys Thr Val Phe Glu
945 950 955 960
Arg Val Asp Val Gln Lys Gly Ala Val Thr Ala Asp Phe Arg Lys Met
965 970 975
Leu Gly Val Gln Ser Leu Glu Glu Lys Lys Asn Arg Asp Arg His Ser
980 985 990
His His Ala Ile Asp Ala Thr Val Leu Thr Leu Val Pro Ala Ala Asn
995 1000 1005
Val Arg Asp Arg Leu Leu Gln Leu Phe Tyr Glu Lys Glu Glu Ala
1010 1015 1020
Glu Arg Phe Ser Gly Asp Ala Gly Leu Lys Glu Arg Ala Phe Leu
1025 1030 1035
Ser Glu Leu Lys Arg Leu Asn Phe Lys Gly Val Ser Arg Leu Pro
1040 1045 1050
Gly Phe Val Asp Glu Lys Ile Leu Ile Asn His Thr Val Arg Asp
1055 1060 1065
Arg Ala Leu Val Pro Ala Ser Arg Lys Leu Arg Arg Arg Gly Lys
1070 1075 1080
Glu Val Leu Phe Asp Gly Lys Asn Arg Arg Met Thr Gly Asp Cys
1085 1090 1095
Ile Arg Gly Gln Leu His Gly Asp Thr Phe Tyr Gly Ala Ile Arg
1100 1105 1110
Gln Tyr Cys Phe Asp Asp Asp His Arg Val Ile Arg Asp Ala Asp
1115 1120 1125
Gly Lys Pro Cys Glu Thr Asp Val Met Tyr Val Val Arg Lys Glu
1130 1135 1140
Phe Arg Ala Lys Gly Asn Ser Gly Asp Ala Gly Gly Phe Ala Ser
1145 1150 1155
Trp Ala Asp Val Glu Lys Ser Ile Val Asp Lys Ala Leu Tyr Arg
1160 1165 1170
Met Met Arg Ser Gln Phe Ala Asp Asp Val Ser Phe Lys Asp Ala
1175 1180 1185
Tyr Ala Gln Gly Ile Tyr Met Leu Asp Arg Ala Gly Arg Arg Val
1190 1195 1200
Asn Lys Ile Arg His Ile Arg Cys Trp Ala Ser Arg Val Asn Asn
1205 1210 1215
Pro Leu Val Val Arg Lys His Thr His Leu Ser Asp Arg Glu Tyr
1220 1225 1230
Lys Gln Asn Tyr Tyr Ala Val Asn Gly Glu Asn Thr Tyr Leu Ala
1235 1240 1245
Ile Tyr Trp Asp Gly Val Ser Asn Asn Asn Asp Phe Asp Cys Lys
1250 1255 1260
Ser Leu Met Asp Leu Ala Arg Met Lys Pro Thr Asp Arg Asn Pro
1265 1270 1275
Leu His Phe Phe Pro Pro Gln Lys Lys Val Gly Lys Gly Lys Lys
1280 1285 1290
Ala Val Phe Met Pro Leu Tyr Ala Val Leu Lys Pro Gly Thr Arg
1295 1300 1305
Val Leu Val Phe Asn Pro Asp Glu Phe Gly Asn Lys Gln Arg Leu
1310 1315 1320
Thr Ile Asp Gly Tyr Lys His Ile Ile Gln His Leu Asp Arg Ser
1325 1330 1335
Gln Leu Met Arg Arg Leu Tyr His Met Val Gly Phe Asp Ser Gly
1340 1345 1350
Asp Gly Arg Ile Gln Phe Lys Tyr His Leu Glu Ala Arg Asn Asp
1355 1360 1365
Asn Arg Leu Met Glu Ala Phe Pro Glu Asn Lys Tyr Gly Lys Val
1370 1375 1380
Gly Lys Ile Gly Phe Ser Tyr Phe Asp Leu Glu Leu Glu Gln Pro
1385 1390 1395
Lys Leu Arg Leu Ser Arg Ser Asn Tyr Tyr Phe Leu Ile Glu Gly
1400 1405 1410
Lys Asp Phe Glu Ile Asp Gly Asp Thr Ile Val Phe Lys
1415 1420 1425
<210> 4
<211> 1314
<212> PRT
<213> C4640 protein sequence
<400> 4
Met Glu Asn Tyr Tyr Leu Gly Leu Asp Leu Gly Thr Asp Ser Ile Gly
1 5 10 15
Trp Ala Val Thr Asp Lys Asn Tyr Asn Ile Pro Lys Phe Lys Gly Asn
20 25 30
Ser Met Trp Gly Ile Arg Leu Leu Glu Gly Gly Asn Thr Ala Val Glu
35 40 45
Arg Arg Glu Phe Arg Ser Ser Arg Arg Arg Leu Glu Arg Asn Lys Tyr
50 55 60
Arg Leu Asn Cys Leu Glu Met Leu Phe Asn Glu Glu Ile Ser Lys Lys
65 70 75 80
Asp Ile Ala Phe Phe Gln Arg Leu Lys Asp Ser Ala Leu Tyr Asp Gly
85 90 95
Asp Lys Asn Val Ser Gly Lys Tyr Ser Leu Phe Asn Asp Pro Asp Tyr
100 105 110
Thr Asp Lys Asp Tyr Tyr Lys Lys Tyr Pro Thr Ile Tyr His Leu Arg
115 120 125
Lys Glu Leu Ile Glu Ser Val Glu Pro His Asp Val Arg Leu Val Phe
130 135 140
Leu Ala Leu His His Ile Ile Lys Asn Arg Gly His Phe Leu Phe Asp
145 150 155 160
Asn Asp Asp Leu Gly Lys Asn Gly Asn Phe Asp Phe Thr Ala Ile Phe
165 170 175
Thr Glu Leu Asn Glu Tyr Ile Lys Ser Asn Ser Asn Tyr Glu Ser Glu
180 185 190
Gly Phe Ser Cys Asn Asp Leu Lys Pro Val Glu Asp Ile Ile Lys Asn
195 200 205
Ser Ser Leu Thr Ser Ser Lys Lys Lys Glu Ala Leu Ile Lys Glu Phe
210 215 220
Ala Leu Asn Lys Lys Val Asp Val Phe Glu Val Ala Ile Val Thr Leu
225 230 235 240
Leu Ser Gly Ala Thr Ala Lys Ala Lys Asp Leu Phe Asn Thr Asp Lys
245 250 255
Tyr Asp Asp Thr Glu Gly Lys Ser Ile Cys Phe Lys Ser Gly Tyr Asp
260 265 270
Asp Lys Ala Thr Thr Tyr Glu Ser Val Phe Gly Glu Asn Phe Glu Leu
275 280 285
Ile Glu Lys Leu Lys Ala Ile Tyr Asp Trp Ala Ile Leu Ala Asp Ile
290 295 300
Leu Asn Gly Lys Glu Tyr Ile Ser Tyr Ala Lys Val Asp Thr Tyr Glu
305 310 315 320
Lys His Lys Arg Asp Leu Lys Leu Leu Lys Glu Tyr Val Lys Ala Tyr
325 330 335
Val Pro Glu Lys Tyr Ser His Ile Phe Asn Glu Asn Ser Asp Lys Val
340 345 350
Ser Asn Tyr Leu Ser Tyr Ser Gly Tyr Ser Ser Lys Asn Pro Val Met
355 360 365
Lys Lys Cys Asp Gln Glu Thr Phe Cys Asp Phe Leu Arg Lys Gln Leu
370 375 380
Pro Lys Glu Leu Leu Asp Glu Lys Tyr Ala Gln Met Tyr Lys Glu Ile
385 390 395 400
Glu Thr Ser Ser Phe Met Pro Lys Ala Val Thr Lys Asp Asn Ser Val
405 410 415
Ile Pro Met Gln Leu Asn Arg Ala Glu Leu Glu Ala Ile Leu Ser Asn
420 425 430
Ala Lys Asn Tyr Leu Pro Phe Leu Ala Asp Arg Asp Ser Ser Gly Lys
435 440 445
Thr Val Ser Glu Lys Ile Ile Asp Ile Phe Lys Tyr Arg Ile Pro Tyr
450 455 460
Tyr Val Gly Pro Leu Asn Lys His Ser Glu Lys Ser Trp Leu Val Arg
465 470 475 480
Thr Gly Glu Lys Ile Tyr Pro Trp Asn Phe Glu Asn Val Val Asp Ala
485 490 495
Asp Asn Ser Ala Glu Lys Phe Ile Asn Asn Leu Thr Ser Lys Cys Thr
500 505 510
Tyr Leu Pro Lys Glu Asp Val Ile Pro Lys Asn Ser Leu Leu Tyr Ser
515 520 525
Ala Phe Thr Val Leu Asn Glu Ile Asn Asn Ile Lys Ile Asp Gly Glu
530 535 540
Glu Ile Ser Val Glu Leu Lys Gln Gly Ile Tyr Lys Asp Leu Phe Glu
545 550 555 560
Lys Ser Asn Lys Val Lys Val Phe Asp Leu Lys Lys Tyr Leu Ala Ser
565 570 575
Asn Gly Tyr Met Asn Ile Glu Ile Thr Gly Ile Asp Thr Thr Ile Lys
580 585 590
Gly Ser Leu Lys Pro Phe Ile Asp Leu Gln Asn Ile Asp Leu Ser Tyr
595 600 605
Ser Asp Lys Glu Glu Ile Ile Lys Ser Val Thr Ile Phe Gly Asp Asp
610 615 620
Lys Lys Leu Leu Lys Asn Arg Leu Lys Arg Leu Tyr Gly Asp Arg Leu
625 630 635 640
Thr Ala Asp Asp Ile Lys Lys Ile Ser Lys Leu Lys Tyr Thr Gly Trp
645 650 655
Ser Arg Leu Ser Glu Lys Leu Leu Thr Gly Ile Glu Ala Val Leu Pro
660 665 670
Ser Thr Gly Glu Tyr Thr Asn Ile Ile His Ala Leu Trp Glu Thr Asn
675 680 685
Asp Asn Leu Met Gln Leu Leu Ser Ser Asn Tyr Asp Phe Arg Lys Lys
690 695 700
Ile Asp Glu Glu Asn Gly Asp Ser Thr Phe Thr Ser Leu Arg Glu Glu
705 710 715 720
Ile Asp Asn Leu Tyr Val Ser Pro Lys Ile Lys Arg Pro Ile Tyr Gln
725 730 735
Ala Met Gln Ile Val Glu Glu Ile Val Lys Ile Gln Gly His Asp Pro
740 745 750
Lys Lys Ile Phe Ile Glu Val Ala Arg Asp Glu Gly Glu Lys Lys Arg
755 760 765
Thr Val Ser Arg Lys Gln Lys Leu Ile Glu Leu Tyr Lys Tyr Cys Lys
770 775 780
Lys Asp Glu Gln Lys Leu Tyr Glu Gln Leu Cys Asn Thr Asp Glu Asn
785 790 795 800
Asp Phe Arg Arg Asp Ala Leu Tyr Leu Tyr Tyr Thr Gln Leu Gly Lys
805 810 815
Cys Met Tyr Thr Gly Lys Pro Ile Glu Leu Ser Glu Ile Tyr Asn Asn
820 825 830
Asn Ile Tyr Asp Ile Asp His Ile Phe Pro Arg Ser Lys Ile Lys Asp
835 840 845
Asp Ser Leu Asn Asn Arg Val Leu Val Leu Lys Ala Glu Asn Ala Lys
850 855 860
Lys Gly Asn Ile Tyr Pro Ile Asn Ser Ser Ile Arg Asn Asn Met Leu
865 870 875 880
Pro Phe Trp Lys Thr Phe Phe Asp Lys Gly Leu Ile Ser Lys Glu Lys
885 890 895
Phe Glu Arg Leu Val Arg Asn Gln Pro Leu Thr Asp Glu Glu Leu Ser
900 905 910
Ser Phe Val Ser Arg Gln Leu Val Glu Thr Arg Gln Ser Thr Lys Ala
915 920 925
Val Ala Gln Leu Leu Lys Lys Arg Tyr His Asp Thr Ala Val Glu Tyr
930 935 940
Ile Lys Ala Ser Leu Val Ser Asp Phe Arg Gln Glu Asn Asp Phe Val
945 950 955 960
Lys Ser Arg Asp Val Asn Asp Phe His His Ala Lys Asp Ala Tyr Leu
965 970 975
Asn Ile Val Val Gly Asn Val Tyr Thr Val Arg Ser His Leu Ala His
980 985 990
Phe Ile Asp Asn Val Gln Ser Gly Lys Trp Ser Val Asn Lys Met Phe
995 1000 1005
Asp Tyr Thr Thr Asn Gly Ala Trp Ile Val Glu Asn Asn Lys Ser
1010 1015 1020
Ile Asn Ile Val Lys Ser Asn Met Ala Lys Asn Asn Ile Arg Phe
1025 1030 1035
Thr Arg Tyr Ala Phe Lys Gln Thr Gly Gly Leu Phe Asp Gln Asn
1040 1045 1050
Pro Leu Lys Lys Gly Leu Gly Gln Val Pro Arg Lys Lys Asp Met
1055 1060 1065
Asp Ile Asp Lys Tyr Gly Gly Tyr Asn Lys Pro Ser Ser Ala Tyr
1070 1075 1080
Phe Ala Phe Val Glu Tyr Lys Asn Ser Lys Gly Glu Ile Val Arg
1085 1090 1095
Ser Phe Glu Pro Val Asp Leu Tyr Ala Glu Lys Glu Tyr Ile Arg
1100 1105 1110
Asn Pro Gln Glu Phe Ile Lys Arg Lys Leu Gly Val Asp Tyr Val
1115 1120 1125
Lys Ile Ile Ile Pro Cys Val Lys Tyr Asn Ala Leu Ile Ser Ile
1130 1135 1140
Asn Gly Phe Arg Met His Ile Ser Ser Lys Ser Ser Gly Gly Ala
1145 1150 1155
Asn Leu Ile Cys Lys Pro Ala Val Gln Leu Val Val Ser Gln Glu
1160 1165 1170
Gln Glu Lys Tyr Ile Lys Lys Ile Ser Ser Tyr Leu Ala Lys Cys
1175 1180 1185
Ala Glu Leu Arg Lys Glu Lys Glu Ile Thr Ser Phe Asp Gly Ile
1190 1195 1200
Thr Glu Lys Asp Asn Ala Asp Leu Tyr Glu Ala Leu Lys His Lys
1205 1210 1215
Thr Asn His Thr Ile Tyr Asn Val Lys Phe Ser Lys Leu Ala Asn
1220 1225 1230
Ile Leu Ile Asp Lys Gln Cys Glu Phe Glu Gly Leu Ser Leu Tyr
1235 1240 1245
Glu Gln Cys Tyr Ile Leu Met Gln Ile Ile Asn Ile Leu His Ala
1250 1255 1260
Asn Val Met Ser Gly Asp Leu Ser Ala Ile Gly Glu Ala Lys Lys
1265 1270 1275
Ser Gly Val Thr Thr Ile Ser Asn Lys Met Gln Pro Ser Tyr Lys
1280 1285 1290
Thr Val Lys Leu Ile Asn Gln Ser Ile Ser Gly Leu Phe Glu Gln
1295 1300 1305
Glu Ile Asp Leu Leu Lys
1310
<210> 5
<211> 1151
<212> PRT
<213> C6165 protein sequence
<400> 5
Met Lys Arg Asn Gly Thr Thr Ala Glu Gln Lys Ala Glu Asn Tyr Arg
1 5 10 15
Trp Arg Ser Glu Ala Leu Asp Lys Lys Ile Gly Leu Glu Glu Leu Ala
20 25 30
Ile Val Leu Gln Lys Ile Asn Gly Gln Ile His Gly Thr Ser Gly Tyr
35 40 45
Leu Gly Ala Ile Ser Asp Arg Ser Lys Glu Leu Tyr Phe Asn His Gln
50 55 60
Thr Val Gly Gln Tyr Gln Met Gln Leu Leu Glu Lys Asp Pro Asn Thr
65 70 75 80
Ser Leu Lys Asn Gln Val Phe Tyr Arg Gln Asp Tyr Leu Asp Glu Phe
85 90 95
Glu Arg Ile Trp Glu Thr Gln Ala Thr Tyr His Pro Glu Leu Thr Ala
100 105 110
Glu Leu Lys His Glu Ile Arg Asp Ile Val Ile Phe Tyr Gln Arg Gln
115 120 125
Leu Lys Ser Gln Lys Ser Leu Val Gly Tyr Cys Glu Leu Glu Ser Lys
130 135 140
Pro Lys Glu Val Val Val Asp Gly Lys Lys Lys Thr Ile Thr Thr Gly
145 150 155 160
Leu Arg Val Cys Pro Lys Ser Ser Pro Leu Phe Gln Ser Phe Lys Ile
165 170 175
Trp Gln Thr Leu Asn Asn Val Gln Val Ser Gly Asn Ile Ile Pro Glu
180 185 190
Lys Gln Leu Asp Leu Phe Gly Thr Ala Ala Thr Tyr Lys Tyr Gly Ser
195 200 205
Arg Cys Leu Thr Glu Glu Glu Lys Gln Thr Leu Tyr Lys Glu Leu Ser
210 215 220
Leu Lys Glu Arg Met Ser Ala Ala Glu Ile Leu Lys Leu Leu Phe Lys
225 230 235 240
Ser Gly Lys Gly Leu Ser Leu Asn Phe Arg Glu Val Asp Gly Asn His
245 250 255
Thr Met Ala Ala Phe Val Lys Ala Cys Gln Thr Ile Ile Val Met Ser
260 265 270
Gly His Asn Glu Tyr Asp Phe Ala Lys Leu Ser Tyr Glu Thr Val Val
275 280 285
Ser Thr Ile Asn Gln Ile Phe Asn Ala Leu Gly Ile Lys Ser Gly Phe
290 295 300
Leu Asn Phe Asp Pro Cys Leu Glu Gly Lys Ala Phe Glu His Gln Pro
305 310 315 320
Ala Tyr Gln Leu Trp His Leu Leu Tyr Ser Tyr Ala Gly Asp Asn Ser
325 330 335
Ala Thr Gly Asn Glu Lys Leu Ile Thr Arg Ile Ala Glu Thr Phe Gly
340 345 350
Met Glu His Asn Tyr Ala Ala Val Phe Ala Thr Ile Thr Phe Met Pro
355 360 365
Asp Tyr Gly Asn Leu Ser Ala Lys Ala Met Arg Arg Ile Leu Pro Tyr
370 375 380
Met Met Asp Gly Asn Glu Tyr Ser Val Ala Cys Glu Tyr Ala Gly Tyr
385 390 395 400
Arg His Ser Arg Arg Ser Leu Thr Lys Glu Glu Leu Asp Lys Lys Pro
405 410 415
Leu Val Asp Thr Leu Pro Leu Leu Pro Arg Asn Ser Leu Arg Asn Pro
420 425 430
Val Val Glu Lys Ile Leu Asn Gln Met Ile Asn Val Val Asn Glu Val
435 440 445
Ser Ala Gln Tyr Gly Lys Pro Asp Glu Ile Arg Ile Glu Met Ala Arg
450 455 460
Glu Leu Lys Lys Ser Ala Lys Glu Arg Glu Gln Met Thr Ala Asp Ile
465 470 475 480
Thr Arg Ala Thr Ala Gln Gln Glu Glu Tyr Arg Arg Ile Leu Gln Asp
485 490 495
Lys Phe Gly Met Pro His Val Ser Arg Asn Asp Ile Ile Arg Tyr Arg
500 505 510
Leu Trp Leu Glu Leu Lys Asp Asn Gly Tyr Lys Thr Leu Tyr Ser Gln
515 520 525
Thr Tyr Ile Pro Arg Glu Lys Leu Phe Ser Lys Glu Phe Asp Val Glu
530 535 540
His Ile Ile Pro Gln Ala Arg Leu Phe Asp Asp Ser Phe Ser Asn Lys
545 550 555 560
Thr Leu Glu Ala Arg Gln Ala Asn Leu Glu Lys Ser Asn Ala Thr Ala
565 570 575
Phe Asp Tyr Val Gln Ser Lys Tyr Gly Glu Lys Gly Ala Lys Glu Tyr
580 585 590
Lys Glu Arg Ile Asp Tyr Leu Tyr Ala Asp Gly Val Ile Ser Lys Thr
595 600 605
Lys His Asp Lys Leu Leu Met Lys Glu Ala Asp Ile Pro Glu Gly Phe
610 615 620
Ile Asn Arg Asp Leu Arg Asp Ser Gln Tyr Ile Ala Arg Lys Ala Arg
625 630 635 640
Glu Ile Leu Glu Ser Met Val Arg Val Val Val Pro Thr Thr Gly Ala
645 650 655
Val Thr Asp Arg Leu Arg Asp Asp Trp Gln Leu Val Asp Ile Met Lys
660 665 670
Glu Leu Asn Trp Asp Lys Tyr Asn Lys Leu Gly Leu Thr Glu Thr Phe
675 680 685
Lys Asp His Asp Gly Arg Gln Ile Lys Arg Ile Lys Asp Trp Thr Lys
690 695 700
Arg Asn Asp His Arg His His Ala Met Asp Ala Leu Thr Ile Ala Phe
705 710 715 720
Thr Gln His Ser Phe Ile Gln Tyr Leu Asn Asn Leu Asn Ala Arg Ser
725 730 735
Asn Lys Ser Gly Ser Ile Tyr Ala Ile Glu Gln Thr Cys Leu Tyr Arg
740 745 750
Asp Ala His Gly Lys Leu Arg Phe Met Pro Pro Met Pro Leu Asp Val
755 760 765
Phe Arg Ala Glu Ala Arg Arg Gln Leu Gln Asp Ile Leu Val Ser Thr
770 775 780
Lys Ala Lys Asn Lys Val Val Thr Arg Asn Ile Asn Ser Ile Lys Arg
785 790 795 800
Arg Gly Glu Lys Lys Gln Thr Val Gln Leu Thr Pro Arg Cys Gln Leu
805 810 815
His Asn Glu Thr Val Tyr Gly Ser Val Arg Cys Cys Val Val Lys Glu
820 825 830
Glu Lys Ile Ser Ser Ser Phe Thr Glu Glu Lys Ile Ala Thr Val Ala
835 840 845
Ser Pro Arg Tyr Arg Glu Ala Leu Leu Lys Arg Leu Gly Glu Asn Gly
850 855 860
Gly Asn Ala Lys Lys Ala Phe Thr Gly Arg Asn Ser Leu Glu Lys Asn
865 870 875 880
Pro Leu Tyr Leu Asn Asp Asp His Thr Leu Ser Val Pro Ala Lys Val
885 890 895
Lys Thr Val Thr Tyr Glu Thr Ile Phe Thr Gln Arg Lys Pro Ile Asp
900 905 910
Lys Asp Leu Lys Val Asp Lys Val Ile Asp Glu His Val Lys Glu Ile
915 920 925
Leu Glu Ala Arg Leu Lys Glu Phe Gly Gly Asp Ala Lys Lys Ala Phe
930 935 940
Ser Asn Leu Glu Glu Asn Pro Ile Trp Leu Asn Arg Glu Arg Gly Ile
945 950 955 960
Gln Ile Lys Arg Val Thr Ile Arg Gly Val Ser Asn Ala Val Ala Leu
965 970 975
His Asp Lys His Asp Val Gln Gly Lys Pro Met Cys Asp Ser Glu Gly
980 985 990
Arg Arg Met Pro Ser Asp Tyr Val Ser Thr Ser Asn Asn His His Val
995 1000 1005
Ala Ile Phe Arg Asp Ala Asn Gly Asn Leu Gln Glu His Val Val
1010 1015 1020
Ser Tyr Phe Glu Ala Thr Ala Arg Ala Ile Gln His Leu Pro Ile
1025 1030 1035
Val Asp Arg Asp Tyr Asn Lys Asp Glu Gly Trp Gln Phe Leu Phe
1040 1045 1050
Thr Met Lys Arg Asn Glu Tyr Phe Val Phe Pro Asn Glu Lys Thr
1055 1060 1065
Gly Phe Asp Pro Lys Glu Ile Asp Leu Leu Asp Pro Lys Asn Tyr
1070 1075 1080
Ala Val Ile Ser Ala Asn Leu Phe Arg Val Gln Lys Leu Ala Thr
1085 1090 1095
Lys Asn Tyr Phe Phe Arg His His Leu Glu Thr Asn Val Glu Thr
1100 1105 1110
Pro Lys Glu Leu Ser Gly Ile Thr Tyr Lys Ser Gln Leu Gly Leu
1115 1120 1125
Lys Gly Ile Ala Gly Ile Val Lys Val Arg Val Asn Asn Ile Gly
1130 1135 1140
Gln Ile Val Ala Val Gly Glu Tyr
1145 1150
<210> 6
<211> 1372
<212> PRT
<213> Lt1Cas9 protein sequence
<400> 6
Met Cys Thr Lys Glu Ser Glu Lys Leu Asn Lys Asn Ala Asp Tyr Tyr
1 5 10 15
Ile Gly Leu Asp Met Gly Thr Ser Ser Ala Gly Trp Ala Val Ser Asp
20 25 30
Ser Glu Tyr Asn Leu Ile Arg Arg Lys Gly Lys Asp Leu Trp Gly Val
35 40 45
Arg Gln Phe Glu Glu Ala Lys Thr Ala Ala Glu Arg Arg Gly Phe Arg
50 55 60
Val Ala Arg Arg Arg Lys Gln Arg Gln Gln Val Arg Asn Arg Leu Leu
65 70 75 80
Ser Glu Glu Phe Gln Asn Glu Ile Thr Lys Ile Asp Ser Gly Phe Leu
85 90 95
Lys Arg Met Glu Asp Ser Arg Phe Val Ile Ser Asp Lys Arg Val Pro
100 105 110
Glu Lys Tyr Thr Leu Phe Asn Asp Ser Gly Tyr Thr Asp Val Glu Tyr
115 120 125
Tyr Asn Gln Tyr Pro Thr Ile Tyr His Leu Arg Lys Ala Leu Ile Glu
130 135 140
Ser Asn Glu Arg Phe Asp Ile Arg Leu Val Phe Leu Gly Ile His Ser
145 150 155 160
Leu Phe Gln His Pro Gly His Phe Leu Asp Lys Gly Asp Val Asp Thr
165 170 175
Asp Asn Thr Gly Pro Glu Glu Leu Ile Gln Phe Leu Glu Asp Cys Met
180 185 190
Asn Glu Ile Gln Ile Ser Ile Pro Leu Val Ser Asn Gln Lys Val Leu
195 200 205
Thr Asp Ile Leu Thr Asp Ser Arg Ile Thr Arg Arg Asp Lys Glu Gln
210 215 220
Gln Ile Leu Glu Ile Leu Gln Pro Asn Lys Glu Ser Lys Lys Ala Val
225 230 235 240
Ser Gln Phe Val Lys Val Leu Thr Gly Gln Lys Ala Lys Leu Gly Asp
245 250 255
Leu Ile Met Met Glu Asp Lys Asp Thr Glu Glu Tyr Lys Tyr Ser Phe
260 265 270
Ser Phe Arg Glu Lys Thr Leu Glu Glu Ile Leu Pro Asp Ile Glu Gly
275 280 285
Val Ile Asp Gly Leu Ala Leu Glu Tyr Ile Glu Ser Ile Tyr Ser Leu
290 295 300
Tyr Ser Trp Ser Leu Leu Asn Ser Tyr Met Lys Asp Thr Leu Thr Gly
305 310 315 320
His Tyr Tyr Ser Tyr Leu Ala Glu Ala Arg Val Ala Ala Tyr Asp Lys
325 330 335
His His Ser Asp Leu Val Lys Leu Lys Thr Leu Phe Arg Glu Tyr Ile
340 345 350
Pro Glu Glu Tyr Asp Asn Phe Phe Arg Lys Met Glu Lys Ala Asn Tyr
355 360 365
Ser His Tyr Ile Gly Ser Thr Glu Tyr Asp Gly Glu Lys Arg Cys Arg
370 375 380
Thr Ala Lys Ala Lys Gln Glu Asp Phe Tyr Lys Ser Ile Asn Lys Met
385 390 395 400
Leu Glu Lys Ile Pro Glu Cys Ser Glu Lys Thr Glu Ile Gln Lys Glu
405 410 415
Ile Ile Glu Gly Thr Phe Leu Leu Lys Gln Thr Gly Pro Gln Asn Gly
420 425 430
Phe Val Pro Asn Gln Leu Gln Leu Lys Glu Leu Arg Lys Ile Leu Gln
435 440 445
Asn Ala Ser Lys His Tyr Pro Phe Leu Thr Glu Lys Asp Glu Arg Asp
450 455 460
Met Thr Ala Ile Asp Arg Ile Glu Ala Leu Phe Ser Phe Arg Ile Pro
465 470 475 480
Tyr Tyr Ile Gly Pro Leu Lys Asn Thr Asp Asn Gln Gly His Gly Trp
485 490 495
Ala Val Arg Arg Asp Gly His Glu Gln Ile Pro Val Arg Pro Trp Asn
500 505 510
Phe Glu Glu Ile Ile Asp Glu Ser Ala Ser Ala Asp Leu Phe Ile Lys
515 520 525
Asn Leu Val Asn Ser Cys Thr Tyr Leu Arg Thr Glu Lys Val Leu Pro
530 535 540
Lys Ser Ser Leu Leu Tyr Gln Glu Phe Glu Val Leu Asn Glu Leu Asn
545 550 555 560
Asn Leu Arg Ile Asn Gly Met Tyr Pro Asp Glu Ile Gln Pro Gly Leu
565 570 575
Lys Arg Met Ile Phe Glu Gln Cys Phe Tyr Ser Gly Lys Lys Val Thr
580 585 590
Gly Lys Lys Leu Gln Leu Phe Leu Arg Ser Val Leu Thr Asn Ser Ser
595 600 605
Thr Glu Glu Phe Val Leu Thr Gly Ile Asp Lys Asp Phe Lys Ser Ser
610 615 620
Leu Ser Ser Tyr Lys Lys Phe Cys Glu Leu Phe Gly Val Lys Thr Leu
625 630 635 640
Asn Asp Thr Gln Lys Val Met Ala Glu Gln Ile Ile Glu Trp Ser Thr
645 650 655
Val Tyr Gly Asp Ser Arg Lys Phe Leu Lys Arg Lys Leu Glu Asp Asn
660 665 670
Tyr Pro Glu Leu Thr Asp Gln Gln Ile Arg Arg Ile Ala Gly Phe Lys
675 680 685
Phe Ser Glu Trp Gly Asn Leu Ser Arg Ala Phe Leu Glu Met Glu Gly
690 695 700
Tyr Lys Asp Glu Ala Gly Asn Pro Val Thr Ile Ile Arg Ala Leu Arg
705 710 715 720
Asp Thr Gln Lys Asn Leu Met Gln Leu Leu Ser Asn Asp Ser Ala Phe
725 730 735
Ala Lys Lys Leu Gln Glu Leu Asn Asp Tyr Val Thr Arg Asp Ile Trp
740 745 750
Ser Ile Glu Pro Asp Asp Leu Asp Gly Met Tyr Leu Ser Ala Pro Val
755 760 765
Arg Arg Met Ile Trp Gln Thr Phe Leu Ile Leu Arg Glu Val Val Asp
770 775 780
Thr Ile Gly Tyr Ser Pro Lys Lys Ile Phe Met Glu Met Ala Arg Gly
785 790 795 800
Glu Gln Glu Lys Lys Arg Thr Ala Ser Arg Lys Lys Gln Leu Ile Asp
805 810 815
Leu Tyr Lys Glu Ala Gly Met Lys Asn Asp Glu Leu Phe Gly Asp Leu
820 825 830
Glu Ser Leu Glu Glu Ala Gln Leu Arg Ser Lys Lys Leu Tyr Leu Tyr
835 840 845
Phe Arg Gln Met Gly Arg Asp Ile Tyr Ser Gly Lys Leu Ile Asp Phe
850 855 860
Met Asp Val Leu His Gly Asn Arg Tyr Asp Ile Asp His Ile His Pro
865 870 875 880
Gln Ser Lys Lys Lys Asp Asp Ser Leu Glu Asn Asn Leu Val Leu Thr
885 890 895
Ser Lys Asp Phe Asn Asn His Ile Lys Gln Asp Val Tyr Pro Ile Pro
900 905 910
Glu Gln Ile Gln Ser Arg Gln Lys Gly Phe Trp Ala Met Leu Leu Lys
915 920 925
Gln Gly Phe Met Ser Gln Glu Lys Tyr Asn Arg Leu Met Arg Thr Thr
930 935 940
Pro Phe Thr Asp Glu Glu Leu Ala Glu Phe Val Asn Arg Gln Leu Val
945 950 955 960
Glu Thr Arg Gln Gly Thr Lys Ala Ile Ile Ser Leu Ile Asn Gln Cys
965 970 975
Phe Pro Asp Ser Glu Val Val Tyr Val Lys Ala Gly Asn Thr Ser Asp
980 985 990
Phe Arg Gln Arg Phe Asp Ile Pro Lys Ser Arg Asp Leu Asn Asn Tyr
995 1000 1005
His His Ala Val Asp Ala Tyr Leu Asn Ile Val Val Gly Asn Val
1010 1015 1020
Tyr Asp Thr Lys Phe Thr Lys Asn Pro Ile Asn Phe Ile Lys Lys
1025 1030 1035
Met Arg Lys Ser Gly Asn Leu His Ser Tyr Ser Leu Arg Arg Met
1040 1045 1050
Tyr Asp Phe Asn Val Gln Arg Gly Asp Gln Thr Ala Trp Val Ala
1055 1060 1065
Glu Asn Asp Thr Thr Leu Lys Thr Val Lys Lys Thr Ala Phe Lys
1070 1075 1080
Thr Ser Pro Met Val Thr Lys Arg Thr Tyr Glu Arg Lys Gly Gly
1085 1090 1095
Leu Ala Asp Ser Val Leu Ile Ala Ala Lys Lys Ala Lys Pro Gly
1100 1105 1110
Val His Leu Pro Val Lys Thr Ser Asp Ser Arg Phe Ala Asn Gln
1115 1120 1125
Val Ser Thr Tyr Gly Gly Tyr Asp Asn Val Lys Gly Ser His Phe
1130 1135 1140
Phe Leu Val Glu His Gln Gln Lys Lys Lys Thr Ile Arg Ser Ile
1145 1150 1155
Glu Asn Val Pro Ile His Leu Lys Glu Lys Leu Lys Thr Lys Glu
1160 1165 1170
Glu Leu Glu His Tyr Cys Ala Gln Val Leu Gly Met Val Gln Pro
1175 1180 1185
Asp Val Arg Leu Thr Arg Ile Pro Met Tyr Ser Leu Leu Leu Ile
1190 1195 1200
Asp Gly Tyr Tyr Tyr Tyr Leu Thr Gly Arg Thr Gly Gly Asn Leu
1205 1210 1215
Ser Leu Ser Asn Ala Val Glu Leu Cys Leu Pro Ala Lys Glu Gln
1220 1225 1230
Ala His Ile Arg Met Ile Ser Lys Ile Ala Gly Gly Arg Ser Thr
1235 1240 1245
Asp Ala Leu Ser Ala Glu Ala Lys Asp Asp Phe Arg Lys Lys Asn
1250 1255 1260
Leu Arg Leu Tyr Asp Glu Leu Ala Glu Lys His Arg Ser Thr Ile
1265 1270 1275
Phe Ser Lys Arg Lys Asn Pro Ile Gly Pro Lys Leu Leu Lys Tyr
1280 1285 1290
Arg Glu Ala Phe Val Lys Gln Thr Ile Glu Asn Gln Cys Lys Val
1295 1300 1305
Ile Leu Gln Ile Leu Lys Leu Thr Ser Thr Asn Cys Lys Thr Ser
1310 1315 1320
Ala Asp Leu Lys Leu Ile Gly Gly Ser Gly Gln Glu Gly Val Met
1325 1330 1335
Ser Ile Ser Lys Leu Leu Arg Ala Glu Lys Tyr Ala Glu Phe Tyr
1340 1345 1350
Leu Ile Cys Gln Ser Pro Ser Gly Ile Tyr Glu Thr Arg Lys Asn
1355 1360 1365
Leu Leu Thr Ile
1370
<210> 7
<211> 1351
<212> PRT
<213> Lt2Cas9 protein sequence
<400> 7
Met Lys Lys Gln Phe Gly Glu Tyr Tyr Leu Ala Phe Asp Ile Gly Thr
1 5 10 15
Asp Ser Val Gly Trp Ala Val Thr Asp Leu Asn Tyr Asn Leu Glu Arg
20 25 30
Leu Asn Gly Lys Tyr Met Trp Gly Thr Arg Leu Phe Glu Ala Gly Lys
35 40 45
Thr Ala Ala Glu Arg Arg Ser Phe Arg Ala Ala Arg Arg Arg Leu Gln
50 55 60
Arg Arg Thr Gln Arg Ile Arg Leu Leu Gln Glu Ile Phe Ala Glu Glu
65 70 75 80
Ile Ser Lys Ala Asp Met Gly Phe Phe His Arg Leu Ala Glu Ser Lys
85 90 95
Tyr Arg Pro Glu Asp Lys Asn Asn Gln Thr Asn Thr Leu Phe Asn Asp
100 105 110
Ala Asn Phe Lys Asp Lys Asp Tyr His Ala Lys Phe Pro Thr Ile Phe
115 120 125
His Leu Arg Lys Thr Leu Ile Glu Ser Lys Glu Lys Phe Asp Val Arg
130 135 140
Leu Val Tyr Leu Ala Leu Gln His Ile Leu Lys Asn Arg Gly His Phe
145 150 155 160
Leu Phe Glu Gly Gln Arg Met Glu Asn Val Thr Ser Phe Asp Ala Ala
165 170 175
Phe Ser Pro Leu Asn Asn Tyr Leu Glu Asp Glu Tyr Gln Phe Ala Phe
180 185 190
Ala Asp Asp Ser Leu Asp Lys Val Gln Asn Ile Leu Lys Asp Ser Ser
195 200 205
Leu Gly Val Lys Asp Lys Asn Lys Tyr Leu Asn Asp Leu Leu Gly Ala
210 215 220
Asp Thr Lys Gln Lys Lys Glu Met Val Asn Leu Ile Cys Gly Gly Thr
225 230 235 240
Ala Lys Ile Lys Asn Ile Leu Asp Ile Val Glu Asp Thr Asp Leu Glu
245 250 255
Ile Asp Lys Leu Cys Phe Lys Thr Met Asp Tyr Ala Asp Ile His Asp
260 265 270
Lys Leu Leu Asp Thr Ile Gly Asp Leu Ser Lys Ile Glu Cys Leu Asp
275 280 285
Arg Leu Lys Ser Ile Phe Asp Trp Ser Leu Leu Ala Glu Ile Lys Lys
290 295 300
Gly Asn Asp Tyr Leu Ser Phe Ala Lys Val Asp Val Tyr Glu Lys His
305 310 315 320
Lys Gln Asp Ile Lys Leu Leu Lys Ser Leu Ile Lys Lys Tyr Tyr Asp
325 330 335
Asn Ala Ala Tyr Asn Glu Ile Phe Asn Asp Val Asn Thr Ala Asn Asn
340 345 350
Tyr Val Ala Tyr Val Gly Met Thr Lys Lys Asn Asn Lys Lys Gln Val
355 360 365
Val Leu Ser Arg Lys Cys Thr Gln Glu Glu Phe Cys Lys Phe Val Lys
370 375 380
Gly Tyr Leu Asp Arg Ile Lys Ser Met Asp Ala Asp Val Glu Gln Leu
385 390 395 400
Lys Ser Lys Cys Ala Asn Val Ser Phe Ala Pro Lys Gln Ile Asn Arg
405 410 415
Asp Asn Gly Val Ile Pro Tyr Gln Met His Leu Leu Glu Leu Glu Lys
420 425 430
Ile Leu Glu Asn Ala Lys Arg Tyr Leu Pro Phe Leu Asn Glu Lys Asp
435 440 445
Ala Ser Gly Tyr Thr Ala Ala Glu Lys Ile Val Lys Ile Met Thr Phe
450 455 460
Arg Ile Pro Tyr Tyr Val Gly Pro Leu Asn Asp His His Lys Ile Ala
465 470 475 480
Asp Gly Gln Lys Gly His Cys Trp Ile Val Lys Asn Ser Asp Glu Lys
485 490 495
Ile Arg Pro Trp Asn Phe Glu Arg Val Val Asp Gln Glu Ala Cys Ala
500 505 510
Glu Ala Phe Ile Arg Arg Met Thr Asn Lys Cys Thr Tyr Met His Asp
515 520 525
Ala Asp Val Leu Pro Lys Asp Ser Leu Leu Tyr Ser Lys Phe Val Val
530 535 540
Leu Asn Glu Leu Asn Asn Val Cys Val Asn Gly Asp Arg Leu Pro Thr
545 550 555 560
Asp Leu Lys Gln His Ile Tyr Arg Glu Leu Phe Met Gln Gln Lys Lys
565 570 575
Val Lys Ala Lys Asn Phe Arg Asp Phe Met Ile Asn Asn Gly Tyr Met
580 585 590
Arg Glu Thr Asp Ser Leu Ser Gly Phe Asp Gly Asp Phe Lys Gly Ser
595 600 605
Leu Gly Ser Leu Ile Asp Phe Arg Lys Ile Ile Gly Val Lys Ala Gly
610 615 620
Asp Asn Lys Met Val Glu Glu Ile Ile Lys Trp Ile Val Leu Phe Gly
625 630 635 640
Asp Thr Lys Lys Leu Leu Lys Asp Arg Ile Asn Lys Phe Tyr Gly Asp
645 650 655
Lys Leu Ser Glu Asn Glu Ile Lys Ala Ile Val Asn Leu Lys Tyr Thr
660 665 670
Gly Trp Gly Arg Leu Ser Arg Glu Phe Leu Glu Gln Ile Thr Ser Gln
675 680 685
Ile Pro Gly Phe Ala Asn Glu Leu Gly Ile Ile Thr Ala Met Tyr Glu
690 695 700
Thr Pro Asn Asn Leu Met Glu Leu Leu Ser Gly Gln Tyr Gln Tyr Leu
705 710 715 720
Asp Lys Leu Gln Ala Tyr Asn Asn Thr Met His Glu Ala Thr Gly Ser
725 730 735
Leu Thr Tyr Glu Thr Val Ala Asp Leu Tyr Val Ser Pro Ala Val Lys
740 745 750
Arg Ser Ile Trp Gln Thr Leu Val Leu Ala Glu Glu Ile Lys Asn Val
755 760 765
Met Gly His Glu Pro Lys Lys Val Phe Ile Glu Met Thr Arg Ser Asp
770 775 780
Val Gln Asn Lys Lys Arg Thr Val Ser Arg Lys Asn Ser Leu Ile Ser
785 790 795 800
Leu Tyr Lys Ala Cys Lys Gly Glu Ala Arg Asp Trp Leu Val Glu Leu
805 810 815
Glu Asn His Thr Asp Gly Asp Leu Arg Gly Asp Lys Leu Phe Leu Tyr
820 825 830
Tyr Thr Gln Met Gly Arg Cys Met Tyr Thr Gly Glu Arg Ile Glu Ile
835 840 845
Gly Glu Leu Phe Ser Thr Gly Ala Asp Gly Arg Ala Leu Tyr Asp Ile
850 855 860
Glu His Ile Phe Pro Gln Ser Lys Ile Lys Asp Asp Ser Ile Asp Asn
865 870 875 880
Arg Val Leu Val Lys Ala Gly Ala Asn Arg Ala Lys Gly Asp Gln Tyr
885 890 895
Pro Val Pro Gln Glu Tyr Arg Asn Lys Cys Gly Gly Ile Trp Lys Met
900 905 910
Leu Leu Glu Arg Gly Phe Ile Ser Gln Lys Lys Tyr Asp Arg Leu Val
915 920 925
Arg Asn Thr Pro Phe Ser Glu Asp Glu Leu Ala Gly Phe Ile Ala Arg
930 935 940
Gln Ile Val Glu Thr Ser Gln Ser Thr Lys Ala Val Ala Glu Ile Leu
945 950 955 960
Lys Arg Glu Tyr Lys Asn Thr Glu Ile Val Tyr Val Lys Ala Gly Asn
965 970 975
Val Ser Ala Phe Arg Gln Gln Asn Asp Phe Val Lys Cys Arg Glu Ile
980 985 990
Asn Asp Tyr His His Ala Lys Asp Ala Tyr Leu Asn Ile Val Val Gly
995 1000 1005
Asn Val Tyr Asn Thr Lys Phe Thr His Ser Pro Ile Asn Phe Ile
1010 1015 1020
Lys Ser Lys Asn Asn Lys Tyr Ser Leu Asn Lys Met Tyr Asp Phe
1025 1030 1035
Lys Val Glu Arg Asn Gly Ser Val Ala Trp Glu Pro Gly Asp Asn
1040 1045 1050
Gly Thr Ile Val Thr Val Lys Arg Thr Met His Asn Asn Arg Val
1055 1060 1065
Leu Phe Thr Arg Tyr Ala Ser Gln Ala Lys Gly Lys Leu Phe Glu
1070 1075 1080
Val Thr Leu Leu Lys Lys Gly Lys Gly Lys His Pro Ile Lys Ser
1085 1090 1095
Asp Met Asn Asp Pro Ile Ser Asn Ile Glu Asn Tyr Gly Gly Tyr
1100 1105 1110
Asp Ser Ile Phe Gly Ala Tyr Phe Phe Leu Val Glu His Thr Leu
1115 1120 1125
Lys Asn Lys Arg Ile Arg Thr Ile Glu Tyr Val Pro Ile Leu Lys
1130 1135 1140
Ala Ala Met Leu Gln Gly Asn Ala Ala Ala Leu Leu Gln Tyr Cys
1145 1150 1155
Leu Asp Asp Leu Asn Leu His Glu Pro Arg Ile Leu Leu Ala Glu
1160 1165 1170
Ile Lys Phe Asn Thr Leu Phe Lys Val Asp Gly Phe Lys Met His
1175 1180 1185
Leu Ser Ala Arg Gln Val Asp Gln Leu Val Tyr Lys Gly Ala Glu
1190 1195 1200
Gln Leu Val Leu Ser Glu Lys Asp Glu Lys Tyr Phe Lys Lys Ile
1205 1210 1215
Ala Lys Tyr Ile Gln Arg Asn Lys Glu Ala Lys Gly Lys Leu Thr
1220 1225 1230
Leu Thr Asp Phe Asp Lys Ile Asp Lys Asp Asn Asn Val Glu Leu
1235 1240 1245
Tyr Asp Arg Leu Leu Glu Lys Leu Lys Asn Thr Val Tyr Gly Val
1250 1255 1260
Arg Leu Gly Glu Gln Ile Lys Lys Leu Glu Lys Gly Arg Glu Ile
1265 1270 1275
Phe Ile Asn Leu Gly Ile Glu Glu Gln Val Ile Thr Leu Phe Glu
1280 1285 1290
Ile Leu His Leu Phe Gln Cys Asn Arg Ile Lys Ser Asn Leu Thr
1295 1300 1305
Lys Val Gly Gly Ser Ala Asn Ala Gly Thr Leu Leu Thr Ala Lys
1310 1315 1320
Lys Ile Ser Val Leu Lys Lys Ile Ser Ile Ile Asn Gln Ser Pro
1325 1330 1335
Thr Gly Leu Phe Glu Gln Glu Ile Asp Leu Leu Lys Leu
1340 1345 1350
<210> 8
<211> 1353
<212> PRT
<213> Lt3Cas9 protein sequence
<400> 8
Met Gln Thr Lys Lys Val Asp Glu Tyr Tyr Val Gly Phe Asp Ile Gly
1 5 10 15
Thr Asn Ser Val Gly Tyr Ala Val Thr Asp Lys Asn Tyr Asn Leu Ile
20 25 30
Lys His Gly Gly Glu Pro Met Trp Gly Ser His Val Phe Glu Ala Ala
35 40 45
Ser Thr Ala Gln Glu Arg Arg Thr Phe Arg Thr Ala Arg Arg Arg Asn
50 55 60
Asp Arg Lys Lys Gln Arg Ile Ala Leu Val Ser Glu Ile Phe Ala Pro
65 70 75 80
Glu Ile Ala Lys Val Asp Pro Arg Phe Phe Ile Arg Arg Arg Glu Ser
85 90 95
Ala Leu Phe Arg Asp Asp Val Asp Ile Lys Asp Arg Tyr Val Val Phe
100 105 110
Asn Asp Asp Asp Phe Thr Asp Lys Asp Tyr Tyr Asp Ile Tyr Pro Thr
115 120 125
Ile His His Leu Ile Tyr Asp Leu Met Ser Asn Lys Glu Lys His Asp
130 135 140
Ile Arg Leu Val Tyr Met Ala Cys Ala Tyr Leu Val Ala His Arg Gly
145 150 155 160
His Phe Leu Ser Glu Val Ser Lys Asp Asn Ile Glu Asp Val Leu Asp
165 170 175
Phe Asp Val Val Tyr Cys Asn Phe Leu Asn Val Met Asp Asn Tyr Ala
180 185 190
Glu Ile Pro Trp Lys Cys Asp Ile Ser Lys Phe Lys Glu Ile Leu Lys
195 200 205
Lys Lys Gln Thr Val Thr Asn Lys Glu Arg Glu Phe Leu Gln Leu Leu
210 215 220
Asn Glu Gly Lys Lys Phe Lys Thr Ser Glu Glu Asp Asp Val Ser Arg
225 230 235 240
Glu Gly Leu Val Lys Leu Leu Ser Gly Gly Thr Tyr Glu Leu Gly Lys
245 250 255
Leu Phe Pro Lys Leu Thr Phe Glu Glu Lys Val Ser Val Ser Phe Asn
260 265 270
Met Ala Glu Glu Asp Phe Ala Met Val Leu Gln Gln Leu Gly Asp Glu
275 280 285
Gly Asp Ile Ile Ser Ser Leu Arg Asn Val Tyr Asp Trp Ala Ile Leu
290 295 300
Ser Asp Val Leu Asn Gly Lys Asn Ser Val Ser Glu Gly Lys Ile Thr
305 310 315 320
Val Tyr Glu Gln His Lys Lys Asp Leu Ser Phe Leu Lys Tyr Phe Val
325 330 335
Lys Lys Tyr Ile Pro Asn Arg Tyr Tyr Glu Val Phe Arg Asp Gly Asn
340 345 350
Ile Val Gly Asn Tyr Val Ser Tyr Ser Tyr Asn Leu Lys Asn Val Gln
355 360 365
Asn Val Ser Lys Phe Lys Gly Ala Lys Lys Asp Val Phe Cys Asp Tyr
370 375 380
Ile Lys Lys Val Val Lys Asp Ile Lys Val Asp Asp Glu Asp Lys Val
385 390 395 400
Glu Tyr Glu Asp Met Met Phe Arg Leu Asp Thr Tyr Ser Phe Ile Pro
405 410 415
Lys Gln Val Glu Asn Asp Asn Arg Val Ile Pro Tyr Gln Leu Tyr Tyr
420 425 430
Tyr Glu Leu Lys Arg Ile Leu Asp Asn Ala Ser Ser Tyr Leu Glu Phe
435 440 445
Leu Asp Glu Lys Asp Met Asp Gly Tyr Thr Ser Arg Glu Lys Leu Leu
450 455 460
Ser Ile Met Glu Phe Arg Ile Pro Tyr Tyr Val Gly Pro Leu Arg Thr
465 470 475 480
Asp Asn Gly Gln His Gly Trp Met Lys Arg Lys Ala Glu Gly Arg Ile
485 490 495
Tyr Pro Trp Asn Phe Glu Asp Lys Val Asp Leu Asp Ala Ser Glu Gln
500 505 510
Glu Phe Ile Asn Arg Met Thr Asn Ser Cys Thr Tyr Leu Pro Gly Glu
515 520 525
Thr Val Val Pro Lys Tyr Ser Leu Leu Tyr Cys Lys Phe Asn Val Leu
530 535 540
Asn Glu Ile Asn Asn Ile Lys Ile Asn Asp Cys Ser Ile Pro Ile Glu
545 550 555 560
His Lys Gln Gly Ile Tyr Lys Leu Phe Glu Arg Tyr Arg Lys Val Thr
565 570 575
Pro Lys Lys Ile Lys Asp Phe Leu Ile Ser Asn Asn Leu Leu His Pro
580 585 590
Glu Asp Val Ile Ser Gly Ile Asp Val Thr Ile Lys Ser Ser Leu Lys
595 600 605
Ser Tyr His Asp Phe Lys Lys Leu Leu Glu Ser Cys Val Leu Lys Glu
610 615 620
Asn Gln Val Glu Ala Ile Ile Glu Arg Leu Thr Tyr Ser Glu Asp Lys
625 630 635 640
Gly Arg Ile Leu Arg Trp Leu His Met Glu Phe Pro Asp Leu Ser Asp
645 650 655
Asp Asp Val Lys Tyr Ile Ser Lys Leu Lys Tyr Ser Asp Phe Gly Arg
660 665 670
Leu Ser Arg Lys Leu Leu Val Gly Ile Arg Gly Cys Asn Lys Asp Thr
675 680 685
Gly Glu Val Asp Ser Ile Met Gly Met Leu Trp Ser Thr Asn Asp Asn
690 695 700
Met Met Lys Leu Leu Ser Asn Ser Tyr Thr Phe Ile Glu Glu Ile Glu
705 710 715 720
Ala Ile Lys Asn Glu Tyr Tyr Val Glu His Pro Ala Asn Leu Asp Ser
725 730 735
Met Leu Asp Glu Met Tyr Val Ser Asn Ala Val Arg Arg Pro Ile His
740 745 750
Arg Thr Leu Asp Ile Leu Ser Asp Ile Arg Lys Val Cys Gly Lys Asn
755 760 765
Pro Ser Lys Ile Phe Val Glu Met Ala Arg Gly Gly Gly Glu Lys Gly
770 775 780
Val Arg Thr Lys Ser Arg Arg Asp Gln Ile Ser Glu Leu Tyr Lys Asn
785 790 795 800
Met Asp Lys Ala Glu Val Arg Glu Leu Ser Glu Gln Leu Glu Gly Lys
805 810 815
Thr Asp Asn Glu Leu Gln Ser Glu Val Leu Phe Leu Tyr Phe Met Gln
820 825 830
Leu Gly Lys Cys Ala Tyr Thr Gln Lys Thr Ile Asp Ile Asp Lys Leu
835 840 845
Lys Thr Asn Ile Tyr Asn Val Asp His Ile Tyr Pro Gln Ser Tyr Val
850 855 860
Lys Asp Asp Ser Ile Thr Asn Lys Val Leu Val Ile Ser Glu Glu Asn
865 870 875 880
Gly Gln Lys Gly Asp Lys Tyr Pro Ile Ser Lys Asp Ile Arg Glu Lys
885 890 895
Met Gln Pro Phe Trp Tyr Arg Leu Leu Ser Asn Lys Leu Ile Ser Glu
900 905 910
Glu Lys Tyr Arg Arg Leu Thr Arg Cys Thr Ser Phe Thr Glu Glu Glu
915 920 925
Leu Thr Gly Phe Ile Asn Arg Gln Leu Val Glu Thr His Gln Ser Thr
930 935 940
Lys Ala Val Thr Thr Val Phe Arg Thr Leu Phe Pro Asp Val Glu Ile
945 950 955 960
Val Tyr Ser Lys Ala Gly Leu Val Ser Glu Phe Arg Lys Glu Phe Asp
965 970 975
Met Leu Lys Thr Arg Ser Val Asn Asp Leu His His Ala Lys Asp Ala
980 985 990
Tyr Leu Asn Ile Val Val Gly Asn Val Tyr His Cys Arg Phe Thr Lys
995 1000 1005
Asn Phe Tyr Ile Thr Gln Lys Tyr Ser Leu Lys Thr Lys Thr Leu
1010 1015 1020
Phe Thr His Ser Val Lys Leu Gly Asp Asp Val Ile Trp Asn Gly
1025 1030 1035
Gln Glu Ser Ile Gly Asn Val Arg Lys Val Leu Ala Lys Asn Asn
1040 1045 1050
Ile His Tyr Thr Lys Tyr Pro Phe Met Arg Lys Gly Gly Leu Phe
1055 1060 1065
Asp Gln Met Pro Val Lys Ala Ala Ala Gly Leu Ile Pro Arg Lys
1070 1075 1080
Thr Gly Leu Asp Thr Glu Lys Tyr Gly Gly Tyr Asn Lys Ser Thr
1085 1090 1095
Ala Thr Ala Phe Leu Leu Val Lys Tyr Lys Glu Lys Gly Lys Gln
1100 1105 1110
Glu Ala Met Ile Met Pro Val Asp Tyr Met Tyr Ser Glu Lys Val
1115 1120 1125
Phe Ser Asp Asn Glu Tyr Ala Leu Lys Tyr Ser Lys Glu Asn Ile
1130 1135 1140
Lys Lys Ile Trp Gly Arg Thr Glu Asp Gln Val Ile Asp Val Ser
1145 1150 1155
Leu Pro Leu Gly Leu Arg Pro Ile Lys Ile Asn Thr Met Leu Ser
1160 1165 1170
Phe Asp Gly Phe Arg Ala Cys Ile Thr Gly Lys Ala Asn Ala Gly
1175 1180 1185
Gln Lys Ile Gly Phe Thr Ser Met Met Pro Leu Val Ile Gly Asn
1190 1195 1200
Glu Trp Glu Asn Tyr Ile Lys Lys Ile Asp Asn Tyr Ile Glu Lys
1205 1210 1215
Lys Gly Lys Asn Lys Asn Ile Thr Leu Asn Glu Lys Asn Asp Gly
1220 1225 1230
Ile Cys Gly Glu Lys Asn Glu Lys Leu Tyr Cys Ile Leu Thr Asp
1235 1240 1245
Lys Ile Ile Asn Asn Ile Tyr Ser Ile Pro Phe Asn Ser Gln Gln
1250 1255 1260
Lys Ile Leu Glu Asn Gly Tyr Asp Lys Phe Lys Lys Leu Asp Ile
1265 1270 1275
Glu Arg Gln Val Tyr Phe Leu Gln Asn Leu Val Leu Val Leu Lys
1280 1285 1290
Ser Gly Arg Ala Gly Ser Cys Asp Met Ser Ala Ile Gly Gly Ser
1295 1300 1305
Lys Asn Ala Ala Thr Phe Ala Phe Gly Ser Lys Leu Ser Leu Trp
1310 1315 1320
Ala Lys Lys Phe Gln Lys Val Tyr Leu Ile Asp Asn Ser Ser Ser
1325 1330 1335
Gly Ile Tyr Gln Asn Met Ser Asp Asn Leu Leu Asp Ile Ile Lys
1340 1345 1350
<210> 9
<211> 1383
<212> PRT
<213> Lt4Cas9 protein sequence
<400> 9
Met Met Lys Glu Ile Lys Asn Tyr Phe Ile Gly Leu Asp Met Gly Thr
1 5 10 15
Thr Ser Val Gly Trp Ala Ala Thr Asp Glu Asn Tyr Glu Ile Ile Lys
20 25 30
Lys Asn Gly Lys Ala Leu Trp Gly Ile Arg Leu Phe Asp Glu Ala Gln
35 40 45
Thr Ala Ala Asp Arg Arg Met His Arg Ile Ala Arg Arg Arg Ile Glu
50 55 60
Arg Arg Ser Arg Arg Ile Asp Leu Leu Gln Glu Leu Phe Ala Gln Glu
65 70 75 80
Ile Cys Lys Lys Asp Pro Gly Phe Tyr Glu Arg Leu Asn Glu Ser Gly
85 90 95
Leu Tyr Glu Glu Asp Lys Thr Val His Gln Lys Asn Ser Leu Phe Asn
100 105 110
Asp Val Asp Phe Asp Asp Lys Ala Tyr Tyr Lys Glu Tyr Pro Thr Ile
115 120 125
Tyr His Leu Arg Tyr Asp Leu Met Thr Lys Asp Arg Pro Phe Asp Val
130 135 140
Arg Leu Val Tyr Leu Ala Val His His Ile Leu Lys His Arg Gly His
145 150 155 160
Phe Leu Phe Asp His Phe Gln Val Asp Glu Asn Gly Val Ser Gly Phe
165 170 175
Glu Glu Ser Phe Ala Ala Phe Gly Asp Ala Leu Glu His Ile Lys Gly
180 185 190
Glu Ser Phe Asp Met Gly Lys Glu Glu Glu Met Lys Ala Leu Cys Arg
195 200 205
Asp Lys Lys Leu Gly Val Arg His Lys Ala Leu Ala Leu Ala Gln Cys
210 215 220
Leu Gly Arg Ser Lys Asp Lys Asp Phe Lys Ala Met Met Thr Leu Ala
225 230 235 240
Ala Gly Gly Thr Ala Leu Leu Ser Glu Val Phe Lys Asp Glu Gly Leu
245 250 255
Lys Asp Phe Ser Lys Asn Lys Val Ser Phe Ser Asp Ser Gln Phe Glu
260 265 270
Asn Asp Lys Pro Glu Ile Ile Ala Glu Leu Gly Asp Arg Tyr Asp Leu
275 280 285
Ile Ala Ala Leu His Gly Leu Tyr Asn Trp Ser Phe Leu Ala Glu Leu
290 295 300
Met Arg Gly His Lys Tyr Ile Ser Glu Ala Lys Ile Glu Ile Tyr Asp
305 310 315 320
Lys His Lys Glu Asp Leu Ala Leu Leu Lys Lys Val Leu Lys Gln Asp
325 330 335
Arg Ser Val Tyr Asn Leu Met Phe Lys Glu Pro Gly Asp Lys Lys Pro
340 345 350
Ile Asn Tyr Ser Ala Tyr Val Lys Ala Cys Lys Thr Asn Gly Lys Lys
355 360 365
Leu Pro Leu Pro Tyr Gly Lys Phe Lys Tyr Glu Glu Phe Ile Lys Thr
370 375 380
Val Lys Phe Cys Leu Lys Asn Leu Pro Asp Ser Pro Asp Lys Lys Asn
385 390 395 400
Ile Glu Asn Lys Leu Glu Glu Gly Ser Phe Leu Leu Lys Ala Val Ser
405 410 415
Val Glu Asn Gly Ala Ile Pro Tyr Gln Leu His Leu Gln Glu Leu Lys
420 425 430
Ile Ile Leu Ser Lys Ala Glu Ala Tyr Leu Pro Phe Leu Lys Val Arg
435 440 445
Asp Gln Tyr Gly Thr Val Ser Asp Lys Ile Ile Ser Leu Phe Thr Phe
450 455 460
Arg Ile Pro Tyr Tyr Val Gly Pro Ile Asn Glu His Ala Gly Ser Cys
465 470 475 480
Trp Val Val Lys Lys Asp Lys Gln Gly Lys Val Tyr Pro Trp Asn Phe
485 490 495
Thr Glu Lys Ile Asp Ile Glu Lys Ser Ala Glu Gly Phe Ile Arg Asn
500 505 510
Leu Thr Asn Lys Cys Thr Tyr Leu Ile Gly Glu Asp Val Leu Pro Lys
515 520 525
Asn Ser Leu Leu Tyr Ser Glu Phe Thr Val Leu Asn Glu Leu Asn Asn
530 535 540
Val Arg Ile Gly Glu Asn Ala Gln Lys Leu Ser Pro Glu Leu Lys Glu
545 550 555 560
Lys Val Leu Glu Asn Leu Phe Lys Lys His Lys His Val Ser Arg Arg
565 570 575
Lys Phe Ile Asn Tyr Leu Val Thr Glu Gly Ile Asp Lys Lys Glu Ala
580 585 590
Glu Ser Ile Ser Gly Leu Asp Gly Asp Phe Lys Ser Ser Met Ser Ser
595 600 605
Leu Ile Asp Met Lys His Ile Leu Gly Asn Asp Phe Ser Arg Glu Asp
610 615 620
Ala Glu Lys Met Ile Lys Asp Ile Thr Ile Phe Gly Gly Asp Lys Lys
625 630 635 640
Met Leu Lys Lys Arg Leu His Arg Glu Phe Ser Tyr Leu Thr Ser Glu
645 650 655
Gln Leu Thr Ser Leu Thr Arg Leu Ser Tyr Asp Gly Trp Gly Arg Leu
660 665 670
Ser Lys Glu Leu Leu Val Asn Leu Leu Pro Val Glu Lys Ser Thr Gly
675 680 685
Glu Val Leu Val Asp Lys Gly Ser Gly Glu Val Leu Asn Ile Ile Ser
690 695 700
Ala Met Glu Gln Thr Ser Tyr Asn Leu Met Glu Leu Leu Ser Ser Arg
705 710 715 720
Phe Gly Tyr Ala Thr Ala Ile Glu Glu Arg Asn Arg Glu Lys Glu Gly
725 730 735
Asn Gly Thr Ile Ser Tyr Gln Asp Val Glu Asp Met Tyr Ile Ser Pro
740 745 750
Ala Val Lys Arg Pro Leu Trp Gln Ala Leu Lys Ile Val Arg Glu Ile
755 760 765
Val Lys Ile Leu Gly Lys Glu Pro Ser Lys Ile Phe Ile Glu Met Ala
770 775 780
Arg Glu Asn Gly Glu Lys Gly Lys Arg Thr Ile Ser Arg Lys Ala Arg
785 790 795 800
Leu Gln Glu Leu Tyr Lys Lys Cys Arg Asp Asp Ser Arg Asp Trp Ala
805 810 815
Lys Glu Leu Ala Glu Lys Pro Glu Glu Asp Phe Arg Ser Asp Arg Leu
820 825 830
Tyr Leu Tyr Tyr Thr Gln Met Gly Arg Ser Met Tyr Thr Gly Lys Pro
835 840 845
Ile Asp Ile Asn Gln Leu Phe Asp Arg Asn Val Tyr Asp Ile Asp His
850 855 860
Ile Tyr Pro Gln Ser Leu Thr Gly Asp Asp Ser Leu Asp Asn Arg Val
865 870 875 880
Leu Val Glu Lys Thr Val Asn Ala Lys Lys Gly Asp Ile Tyr Pro Leu
885 890 895
Gly Ser Ala Leu Asp Gly Cys His Ile Gln Gly Glu Ile His Ile Gln
900 905 910
Asp Ile Gln Arg Glu Met Arg Pro Phe Trp His Met Leu Leu Glu Lys
915 920 925
Gly Leu Ile Ser Lys Glu Lys Tyr Asn Arg Leu Ser Arg Thr Thr Pro
930 935 940
Leu Ser Asp Thr Glu Lys Ala Ala Phe Ile Gly Arg Gln Leu Val Glu
945 950 955 960
Thr Arg Gln Ser Thr Lys Ala Cys Ala Glu Leu Leu Ser Lys Ala Tyr
965 970 975
Pro Gln Ala Arg Ile Val Tyr Thr Lys Ala Gly Asn Ala Ser Arg Phe
980 985 990
Arg Gln Tyr Gly Gly Phe Ile Lys Val Arg Asp Met Asn Asp Tyr His
995 1000 1005
His Ala Lys Asp Ala Tyr Leu Asn Ile Val Val Gly Asn Val Phe
1010 1015 1020
Asp Thr Arg Phe Thr Ala Asn Pro Leu His Phe Leu Lys Gly Asn
1025 1030 1035
His Pro Val Tyr Ser Leu Asn Thr Glu Ala Leu Tyr Gly His Lys
1040 1045 1050
Val Ser Arg Gly Gly Val Asp Ala Trp Ile Pro Pro Glu Lys Asp
1055 1060 1065
Asp Glu Gly His Ile Met Ala Gly His Glu Gly Thr Met Gly Thr
1070 1075 1080
Val Arg Lys Trp Met Arg Lys Asn Asn Ile Leu Phe Thr Arg Met
1085 1090 1095
Pro Leu Glu Gly Lys Gly Gly Leu Phe Asp Gln Thr Ile Met Lys
1100 1105 1110
Lys Gly Lys Gly Gln Val Pro Leu Lys Gly Asp Ser Pro Val Ser
1115 1120 1125
Asp Ile Glu Lys Tyr Gly Gly Tyr Asn Lys Ala Ser Ser Ala Tyr
1130 1135 1140
Phe Val Leu Thr Ser Ser Lys Leu Lys Asp Glu Thr Ile Tyr Thr
1145 1150 1155
Ile Glu Thr Ile Pro Leu Ile Ile Lys Arg Met Ile Gln Thr Asn
1160 1165 1170
Lys Asp Lys Glu Asp Tyr Ile Lys Arg His Trp Lys Asp His Gly
1175 1180 1185
Lys Lys Met Val Asn Pro His Ile Cys Tyr Gly His Ile Pro Val
1190 1195 1200
Gln Ser Leu Leu Glu Ile Asn Gly Phe Lys Val His Leu Thr Gly
1205 1210 1215
Lys Ser Gly Lys Asp Phe Lys Leu Arg Asn Ala Glu Gln Leu Cys
1220 1225 1230
Ile Ser Asn Asp Asp Ala Ala Val Leu Lys Arg Val Leu Lys Tyr
1235 1240 1245
Asn Glu Arg Ser Ser Leu Ser Lys Gly Lys Glu Ala Leu Leu Ile
1250 1255 1260
Thr Pro Phe Asp Asn Ile Gln Glu Val Asp Leu Asn Arg Leu Tyr
1265 1270 1275
Gln Val Phe Glu Asp Lys Leu Thr Asn Gln Val Tyr Lys Val Lys
1280 1285 1290
Leu Gly Lys Gln Ala Ser Val Leu Lys Lys Gly Glu Asp Lys Phe
1295 1300 1305
Asn Glu Leu Pro Leu Glu Val Lys Cys Arg Val Ile Gly Glu Ile
1310 1315 1320
Leu His Leu Phe Gln Cys Asn Ala Ala Ile Ala Asp Leu Arg Leu
1325 1330 1335
Ile Gly Gly Ala Lys Asn Ala Gly Ala Leu Thr Met Asn Pro Arg
1340 1345 1350
Val Ser Pro Glu Asp His Val Tyr Leu Ile Glu Gln Ser Val Thr
1355 1360 1365
Gly Phe Phe Glu Lys Arg Ile Leu Leu Ala Pro Tyr Gly Gly Lys
1370 1375 1380
<210> 10
<211> 1381
<212> PRT
<213> Lt5Cas9 protein sequence
<400> 10
Met Lys Glu Ile Lys Lys Ile Phe Ile Gly Leu Asp Met Gly Thr Asn
1 5 10 15
Ser Val Gly Trp Thr Ala Thr Asp Glu Asn Tyr Glu Val Ile Lys Lys
20 25 30
Asn Gly Lys Ala Leu Trp Gly Ile Arg Leu Phe Asp Glu Ala Gln Thr
35 40 45
Ala Glu Asp Arg Arg Met His Arg Ile Ala Arg Arg Arg Ile Glu Arg
50 55 60
Arg Ser Arg Arg Ile Asp Leu Leu Gln Glu Leu Phe Ala Gln Glu Ile
65 70 75 80
Cys Lys Lys Asp Pro Gly Phe Tyr Glu Arg Leu Asn Glu Ser Gly Leu
85 90 95
Tyr Glu Glu Asp Lys Thr Val His Gln Thr Asn Ser Leu Phe Asn Asp
100 105 110
Val Asp Phe Asn Asp Lys Ala Tyr Tyr Lys Lys Tyr Pro Thr Ile Tyr
115 120 125
His Leu Arg His Ala Leu Met Thr Glu Asn His Pro Phe Asp Val Arg
130 135 140
Leu Val Tyr Leu Ala Ile His His Ile Leu Lys His Arg Gly His Phe
145 150 155 160
Leu Phe Glu Asn Phe Gln Thr Asp Glu Lys Gly Thr Ser Gly Phe Asp
165 170 175
Glu Ser Phe Ala Ala Phe Gly Ser Ala Leu Asp Arg Ile Lys Gly Ser
180 185 190
Ser Pro Asp Val Arg Lys Ala Asp Ser Met Lys Asp Ile Leu Lys Asp
195 200 205
Lys Lys Leu Gly Val Lys Glu Lys Ala Ala Ser Leu Leu Gln Cys Leu
210 215 220
Gly Gln Gly Lys Glu Lys Asp Phe Lys Ala Met Met Thr Leu Ala Ala
225 230 235 240
Gly Gly Thr Ala Ser Leu Ser Asp Ile Phe Asn Asp Glu Lys Leu Lys
245 250 255
Asp Phe Glu Lys Asn Lys Val Asn Phe Ser Ser Ala Gln Phe Glu Glu
260 265 270
Asn Glu Pro Asp Ile Met Ala Glu Leu Gly Asp Arg Tyr Asp Leu Ile
275 280 285
Ala Ala Leu His Gly Phe Tyr Asn Trp Ser Leu Leu Ala Glu Leu Met
290 295 300
Gly Glu Tyr His Tyr Ile Ser Glu Ala Lys Ile Ala Val Tyr Asp Lys
305 310 315 320
His Lys Ala Asp Leu Lys Val Leu Lys Arg Val Leu Lys Gln Arg Pro
325 330 335
Asp Ile Tyr Ala Lys Ile Phe Arg Glu Pro Gly Ser Ser Ala Asn Lys
340 345 350
Asn Tyr Ser Ala Tyr Val Gly Val Cys Lys Val Lys Gly Lys Lys Ala
355 360 365
Ala Ile Glu Lys Cys Ser Tyr Glu Asp Phe Thr Lys Thr Leu Lys Pro
370 375 380
Cys Leu Lys Asp Met Pro Asp Ser Asn Asp Lys Asp Tyr Ile Ser Arg
385 390 395 400
Glu Leu Asn Met Gly Thr Phe Leu Pro Lys Ser Val Ser Lys Glu Asn
405 410 415
Gly Val Ile Pro Tyr Gln Leu His Leu Gln Glu Leu Lys Ile Ile Leu
420 425 430
Ser Lys Ala Glu Ala Tyr Leu Pro Phe Leu Lys Val Lys Asp Gln Tyr
435 440 445
Gly Thr Val Ser Asp Lys Ile Ile Ser Leu Phe Thr Phe Arg Ile Pro
450 455 460
Tyr Tyr Val Gly Pro Ile Asn Glu His Ala Gly Ser Cys Trp Val Val
465 470 475 480
Lys Lys Asp Lys Arg Gly Lys Val Tyr Pro Trp Asn Phe Thr Glu Lys
485 490 495
Ile Asp Ile Glu Lys Ser Ala Glu Gly Phe Ile Arg Asn Leu Thr Asn
500 505 510
Lys Cys Thr Tyr Leu Ile Gly Glu Asp Val Leu Pro Lys Asn Ser Leu
515 520 525
Leu Tyr Ser Glu Phe Thr Val Leu Asn Glu Leu Asn Asn Val Arg Ile
530 535 540
Gly Glu Thr Met Gln Lys Leu Pro Leu Arg Leu Lys Glu Lys Val Met
545 550 555 560
Asp Asn Leu Phe Ser Arg Tyr Lys His Val Ser Arg Thr Lys Phe Ile
565 570 575
Lys Tyr Leu Val Ser Glu Gly Ile Asp Lys Lys Glu Ala Glu Ser Ile
580 585 590
Ser Gly Leu Asp Gly Asp Phe Lys Ser Ser Leu Ser Ser Leu Ile Asp
595 600 605
Met Lys His Ile Leu Gly Asn Asp Phe Ser Arg Glu Asn Ala Glu Lys
610 615 620
Met Ile Gln Asp Ile Thr Ile Phe Gly Gly Asp Lys Lys Met Leu Lys
625 630 635 640
Asn Arg Leu His Arg Glu Phe Ser Tyr Leu Thr Pro Glu Gln Leu Thr
645 650 655
Ser Leu Thr Gln Leu Ser Tyr Asp Gly Trp Gly Arg Leu Ser Lys Glu
660 665 670
Phe Leu Val Asn Leu Leu Pro Ala Glu Gly Asp Ser Cys Glu Val Leu
675 680 685
Val Asp His Thr Ser Gly Glu Val Leu Asn Ile Ile Ser Ala Met Arg
690 695 700
Gln Thr Ser Tyr Asn Leu Met Glu Leu Leu Gly Ser Arg Phe Gly Tyr
705 710 715 720
Gly Gln Ala Ile Glu Glu Arg Asn Lys Lys Glu Glu Gly Gln Gly Arg
725 730 735
Ile Thr Tyr Lys Asp Val Glu Asp Leu Tyr Ile Ser Pro Ala Val Arg
740 745 750
Arg Pro Leu Trp Gln Ala Leu Lys Ile Val Arg Glu Ile Val Lys Ile
755 760 765
Thr Gly Lys Glu Pro Ser Lys Ile Phe Ile Glu Met Ala Arg Glu Asn
770 775 780
Gly Glu Lys Gly Lys Arg Thr Ile Ser Arg Lys Ala Arg Leu Gln Ala
785 790 795 800
Leu Tyr Lys Lys Cys Arg Asp Asp Thr Arg Asp Trp Ala Lys Glu Leu
805 810 815
Glu Gly Lys Ser Glu Glu Asp Phe Arg Ser Asp Arg Leu Tyr Leu Tyr
820 825 830
Tyr Thr Gln Met Gly Arg Ser Met Tyr Thr Gly Lys Pro Ile Asp Ile
835 840 845
Asn Arg Leu Phe Asp Arg Asn Val Tyr Asp Ile Asp His Ile Tyr Pro
850 855 860
Gln Ser Leu Thr Gly Asp Asp Ser Leu Asp Asn Arg Val Leu Val Glu
865 870 875 880
Lys Thr Val Asn Ala Lys Lys Gly Asp Thr Tyr Pro Leu Ser Ser Ala
885 890 895
Leu Asp Gly Cys Tyr Ile Ser Gly Gln Gln Ile Arg Ile Gln Asp Ile
900 905 910
Gln Lys Glu Met Arg Pro Phe Trp His Met Leu Leu Glu Lys Glu Leu
915 920 925
Ile Ser Lys Glu Lys Tyr Asn Arg Leu Ser Arg Thr Ile Pro Leu Ser
930 935 940
Asp Ala Glu Lys Ala Ala Phe Ile Gly Arg Gln Leu Val Glu Thr Arg
945 950 955 960
Gln Ser Thr Lys Ala Cys Ala Glu Leu Leu Ser Lys Ala Tyr Pro Gln
965 970 975
Thr Arg Ile Val Tyr Thr Lys Ala Gly Asn Ala Ser Arg Phe Arg Gln
980 985 990
Tyr Gly Gly Phe Ile Lys Val Arg Asp Met Asn Asp Tyr His His Ala
995 1000 1005
Lys Asp Ala Tyr Leu Asn Ile Val Val Gly Asn Val Phe Asn Thr
1010 1015 1020
Arg Phe Thr Ala Asn Pro Leu His Phe Leu Lys Gly Asn His Gln
1025 1030 1035
Ala Tyr Ser Leu Asn Thr Glu Ala Leu Tyr Gly His Lys Val Ser
1040 1045 1050
Arg Asn Gly Val Asp Ala Trp Ile Pro Ala Glu Lys Asp Glu Lys
1055 1060 1065
Gly Gln Val Met Ala Gly His Glu Gly Thr Met Gly Thr Val Arg
1070 1075 1080
Lys Trp Met Arg Lys Asn Asn Ile Leu Phe Thr Arg Met Pro Tyr
1085 1090 1095
Glu Gly Lys Gly Gly Leu Phe Asp Gln Asn Ile Met Lys Lys Glu
1100 1105 1110
Lys Gly Gln Val Pro Ile Lys Gly Asp Ser Pro Ile Ser Asn Ile
1115 1120 1125
Lys Lys Tyr Gly Gly Tyr Asn Lys Ala Lys Val Ala Tyr Phe Val
1130 1135 1140
Leu Thr Gln Ser Lys Leu Asn Lys Lys Thr Val Tyr Thr Leu Glu
1145 1150 1155
Ala Ile Pro Leu Ile Leu Lys Asn Ser Ile Gln Ser Asn Glu Asp
1160 1165 1170
Lys Glu Thr Tyr Ile Gln Lys Gln Trp Arg Lys Asn Gly Lys Lys
1175 1180 1185
Met Glu His Pro Ile Val Cys Leu Gly His Ile Pro Val Gln Ser
1190 1195 1200
Leu Leu Glu Ile Asn Gly Phe Lys Val His Leu Ser Gly Lys Asn
1205 1210 1215
Gly Lys Asp Ile Leu Leu Arg Asn Ala Glu Gln Leu Cys Ile Asn
1220 1225 1230
Glu Ala Asp Thr Ala Val Leu Lys Lys Ile Leu Lys Phe Asn Gln
1235 1240 1245
Arg Ala Ala Met Ser Lys Lys Gly Glu Glu Ile Phe Ile Asn Ser
1250 1255 1260
Phe Asp Asn Ile Gln Glu Glu Asp Leu Asn Arg Leu Tyr His Val
1265 1270 1275
Phe Glu Asp Lys Leu Thr Asn Gln Ile Tyr Lys Val Lys Leu Glu
1280 1285 1290
Lys Gln Ala Ala Val Leu Lys Lys Gly Glu Glu Thr Phe Asn Arg
1295 1300 1305
Leu Ser Pro Glu Gln Lys Cys Lys Leu Ile Gly Glu Ile Leu His
1310 1315 1320
Leu Cys Gln Cys Lys Ala Thr His Ala Asp Leu Arg Leu Ile Gly
1325 1330 1335
Gly Ala Lys Lys Ala Gly Ile Leu Thr Met Gly Thr Gln Ile Tyr
1340 1345 1350
Pro Lys Asp His Val Tyr Leu Ile Glu Gln Ser Val Thr Gly Phe
1355 1360 1365
Phe Glu Lys Arg Ile Leu Leu Ala Pro Phe Gly Glu Lys
1370 1375 1380
<210> 11
<211> 1368
<212> PRT
<213> Lt6Cas9 protein sequence
<400> 11
Met Asp Arg Lys Glu Tyr Phe Ile Gly Leu Asp Met Gly Thr Asp Ser
1 5 10 15
Val Gly Trp Ala Val Thr Asp Thr Glu Tyr Lys Val Ile Lys Phe Asn
20 25 30
Gln Lys Ala Leu Trp Gly Val Arg Leu Phe Asp Gly Ala Lys Thr Ala
35 40 45
Ala Glu Arg Arg Gly Phe Arg Thr Ser Arg Arg Arg Val Glu Arg Arg
50 55 60
Asn Gln Arg Leu Asn Trp Leu Gln Asn Val Phe Ser Glu Glu Ile Ala
65 70 75 80
Lys Lys Asp Pro Gly Phe Phe Gln Arg Leu Lys Glu Ser Met Phe Ser
85 90 95
Pro Glu Asp Lys Arg Ser Asp Phe Glu Glu Gln Pro Gly Arg Phe Ser
100 105 110
Leu Phe Asn Asp Ser Asp Tyr Thr Asp Lys Glu Tyr Tyr Lys Asp Tyr
115 120 125
Pro Thr Ile Ser His Leu Lys Val Asp Leu Leu Glu Lys Asp Lys Lys
130 135 140
Phe Asp Ile Arg Leu Ile Tyr Leu Ala Val His His Ile Ile Lys Lys
145 150 155 160
Arg Gly His Phe Leu Phe Asp Asp Ile Ser Ala Asp Asp Glu Ile Thr
165 170 175
Phe Asp Ile Gly Phe Asp Ser Leu Asn Asp Tyr Leu Glu Asp Cys Gly
180 185 190
Phe Glu Arg Ile Thr Leu Asn Asn Lys Asp Asp Phe Thr Lys Thr Leu
195 200 205
Leu Asp Lys Lys Ile Lys Val Lys Asp Lys Thr Lys Leu Leu Leu Glu
210 215 220
His Ala Gly Tyr Ser Lys Lys Asp Lys Gln Lys Glu Ala Leu Phe Thr
225 230 235 240
Leu Ile Ser Gly Gly Ser Gly Asn Ile Ala Asp Leu Tyr Asp Asp Glu
245 250 255
Ser Leu Lys Glu Leu Lys Ser Ile Lys Leu Ala Asn Tyr Glu Asn Phe
260 265 270
Glu Ala Glu Leu Ser Glu Val Leu Gly Asp Asp Phe Glu Leu Ile Asn
275 280 285
Arg Ile Lys Ala Val Tyr Asp Trp Ala Leu Leu Glu Asp Ile Leu Gln
290 295 300
Gly Glu Thr Tyr Ile Ser Thr Ala Arg Lys Lys Ile Tyr Glu Lys His
305 310 315 320
Ala Lys Asp Leu Lys Leu Leu Lys Gln Leu Val Arg Lys Tyr Leu Thr
325 330 335
Ala Glu Asp Tyr Arg Glu Ile Phe Arg Val Cys Ser Thr Lys Ile Asp
340 345 350
Asn Tyr Thr Ala Tyr Ser Gly Asn Tyr Ser Asp Lys Ser Gln Lys Ser
355 360 365
Thr Asp Asp Lys Leu Lys Thr Asn Pro Ser Ala Ser Arg Glu Asp Phe
370 375 380
Tyr Lys Tyr Ile Lys Lys Lys Phe Lys Pro Phe Glu Asp Lys Asp Asp
385 390 395 400
Val Lys Ala Ile Phe Ser Ala Met Glu Glu Asn Asp Phe Leu Pro Lys
405 410 415
Gln Val Gly Ala Glu Asn Gly Leu Ile Pro His Gln Ile Asn Glu Arg
420 425 430
Glu Leu Lys Lys Ile Leu Glu Asn Ser Ser Lys Tyr Tyr Pro Phe Leu
435 440 445
Asn Glu Ile Thr Asp Asp Ser Gly Leu Thr Thr Ser Glu Lys Ile Leu
450 455 460
Gln Val Phe Asn Phe Lys Ile Pro Tyr Tyr Ile Gly Pro Leu Asn Lys
465 470 475 480
Asn Ser Lys Phe Ala Trp Leu Glu Arg Thr Asp Glu Lys Ile Tyr Pro
485 490 495
Trp Asn Phe Ser Asp Val Val Asp Thr Lys Lys Ser Ala Glu Asn Phe
500 505 510
Ile Val Arg Met Thr Ala Lys Cys Thr Tyr Thr Gly Ala Asp Val Leu
515 520 525
Pro Lys Asp Ser Leu Leu Tyr Ser Lys Tyr Met Val Leu Asn Glu Ile
530 535 540
Asn Lys Leu Lys Ile Asn Gly Glu Pro Ile Thr Val Glu Gln Lys Gln
545 550 555 560
Ser Met Tyr Asn Asp Leu Phe Met Arg Tyr Lys Thr Val Lys Thr Lys
565 570 575
Thr Phe Lys Asp Tyr Leu Lys Asn Asn Phe Gly Val Thr Asp Ala Asp
580 585 590
Val Val Ser Gly Ile Asp Met Glu Thr Gly Ile Lys Ala Ser Leu Ser
595 600 605
Ser Tyr His Ala Phe Arg Asn Ile Leu Glu Asn Tyr Lys Asp Glu Glu
610 615 620
Met Val Glu Asp Ile Ile Arg His Ile Val Leu Phe Gly Asp Asp Lys
625 630 635 640
Lys Leu Ile Lys Ser Tyr Ile Ser Glu Lys Tyr Gly His Ile Leu Ser
645 650 655
Glu Ala Asp Ile Lys Tyr Ala Ala Ser Lys Lys Phe Thr Gly Trp Gly
660 665 670
Arg Leu Ser Lys Glu Leu Leu Thr Lys Ile Tyr His Val Asn Arg Glu
675 680 685
Thr Gly Glu Ala Lys Ser Ile Ile Thr Ser Met Trp Glu Asp Asn Lys
690 695 700
Asn Leu Met Glu Leu Leu Ser Cys Glu Tyr Asp Tyr Met Asp Lys Ala
705 710 715 720
Ile Glu Tyr Lys Lys Gln His Met Pro His Gly Ala Asn Ser Val Lys
725 730 735
Asp Phe Ile Glu Glu Ser Tyr Ala Ser Pro Ser Val Lys Arg Ala Ile
740 745 750
Ile Gln Ala Thr Gly Ile Ile Asn Glu Ile Glu Lys Ile Met Lys Ala
755 760 765
Ser Pro Lys Arg Ile Phe Ile Glu Met Ala Arg Glu Lys Glu Asp Pro
770 775 780
Lys Lys Lys Lys Lys Asn Glu Ala Lys Arg Lys Pro Arg Lys Asp Gln
785 790 795 800
Leu Ile Glu Leu Tyr Lys Lys Cys Lys Glu Glu Glu Pro Glu Leu Phe
805 810 815
Ala Ser Leu Gln Asp Thr Ser Glu Glu Lys Leu Arg Lys Asp Ser Leu
820 825 830
Tyr Phe Tyr Tyr Ala Gln Leu Gly Arg Cys Met Tyr Ser Gly Glu Arg
835 840 845
Ile Ser Ile Glu Arg Leu Ser Ser Asp Tyr Asp Ile Asp His Ile Tyr
850 855 860
Pro Arg Ser Lys Thr Lys Asp Asp Ser Leu Asp Asn Arg Val Leu Val
865 870 875 880
Lys Lys Thr Ile Asn Ser Asp Ile Lys Lys Asp Asn Tyr Pro Ile Asp
885 890 895
His Asn Ile Gln Lys Asn Met Lys Asn Phe Trp Asn Glu Leu Glu Lys
900 905 910
Lys Gly Met Ile Ser Lys Glu Lys Tyr Lys Arg Leu Thr Arg Thr Ser
915 920 925
Lys Phe Thr Asp Ser Glu Leu Ala Gly Phe Ile Asn Arg Gln Leu Val
930 935 940
Glu Thr Arg Gln Ser Ser Lys Val Thr Ala Glu Ile Leu Gln Ser Leu
945 950 955 960
Tyr Gly Asp Asn Ser Glu Ile Val Tyr Val Lys Gly Gly Asn Val Ser
965 970 975
Asp Phe Arg Lys Gly Ser Val Asn Lys Lys Thr Gly Glu Val Glu Arg
980 985 990
Lys Ala Phe Val Lys Cys Arg Asp Ile Asn Asp Tyr His His Ala Lys
995 1000 1005
Asp Ala Tyr Leu Asn Ile Val Val Gly Asn Val Tyr His Ile Lys
1010 1015 1020
Phe Thr Lys Ser Pro Val Asn Phe Ile Lys Glu Leu Lys Lys Ser
1025 1030 1035
Asn Gln Thr Tyr Ser Leu Asn Lys Val Phe Asp Tyr Pro Val Glu
1040 1045 1050
Arg Gly Gly Glu Thr Ala Trp Val Pro Gly Lys Asp Gly Thr Phe
1055 1060 1065
Ala Thr Val Glu Arg Gln Met Ser Lys Asn Asn Ile Leu Phe Thr
1070 1075 1080
Arg Leu Ala Tyr Glu Val Lys Gly Gly Phe Tyr Asp Gln Gln Leu
1085 1090 1095
Leu Lys Lys Gly Phe Gly Gln Ala Pro Ile Lys Thr Ser Asp Glu
1100 1105 1110
Arg Phe Asp Leu Ser Arg Lys Lys Tyr Gly Gly Tyr Asn Asn Ser
1115 1120 1125
Tyr Gly Ala Tyr Phe Thr Tyr Val Glu His Asp Asn Lys Lys Lys
1130 1135 1140
Arg Ile Arg Ser Ile Glu Pro Val Phe Ile Leu Asn Lys Asn Val
1145 1150 1155
Phe Glu Ala Asp Pro Val Lys Tyr Cys Thr Glu Val Leu Asn Leu
1160 1165 1170
Lys Asn Pro Val Val Leu Ile Lys Gln Ile Lys Ile Asp Ser Leu
1175 1180 1185
Ile Ser Ile Asp Gly Phe Arg Gly His Ile Ser Gly Arg Thr Gly
1190 1195 1200
Gly Gln Ile Val Leu Lys Asn Ala Asn Gln Leu Leu Leu Glu Asp
1205 1210 1215
Leu Trp His Asp Tyr Val Lys Lys Leu Ser Lys Tyr Leu Asp Arg
1220 1225 1230
Cys Lys Asp Thr Gln Lys Glu Leu Pro Ile Thr Lys Phe Asp Gly
1235 1240 1245
Ile Thr Lys Glu Glu Asn Leu Met Leu Tyr Asp Val Ile Thr Gln
1250 1255 1260
Lys Leu Asn Thr Lys Leu Tyr Glu Ile Lys Tyr Gly Thr Gln Tyr
1265 1270 1275
Asp Thr Leu Lys Asn Asn Arg Asn Lys Phe Asp Gly Leu Ser Ile
1280 1285 1290
His Asp Gln Ala Val Ile Leu Lys Asn Asn Leu Asn Met Leu Lys
1295 1300 1305
Cys Asn Ala Val Asn Ala Asp Phe Thr Leu Leu Thr Gly Ala Gly
1310 1315 1320
Cys Val Gly Arg Ile Lys Ile Thr Lys Asn Ile Ser Pro Gln Ile
1325 1330 1335
Tyr Arg Glu Ile Lys Ile Ile Tyr Gln Ser Ile Thr Gly Val Phe
1340 1345 1350
Glu Lys Ser Val Asp Leu Leu Gly Asp Phe Gly Glu Ser Thr Lys
1355 1360 1365
<210> 12
<211> 1327
<212> PRT
<213> Lt7Cas9 protein sequence
<400> 12
Met Lys Lys Asn Tyr Thr Ile Gly Leu Asp Ile Gly Thr Ala Ser Val
1 5 10 15
Gly Trp Ala Val Leu Thr Glu Asp Tyr Glu Leu Ile Lys Arg Lys Met
20 25 30
Lys Ile Ser Gly Asn Thr Gln Lys Lys Ala Val Lys Lys Asn Phe Trp
35 40 45
Gly Val Arg Leu Phe Glu Gln Gly Glu Thr Ala Glu Gly Arg Arg Leu
50 55 60
Lys Arg Thr Thr Arg Arg Arg Ile Ala Arg Arg Arg Gln Arg Ile Gln
65 70 75 80
Tyr Leu Arg Thr Ile Phe Asp Glu Gly Met Asn Gln Val Asp Ala Asn
85 90 95
Phe Phe Ala Arg Leu Asp Glu Ser Phe Ser Ile Ile Asp Glu Lys Glu
100 105 110
Asn Glu Arg His Pro Ile Phe Gly Asn Val Ala Glu Glu Ala Ala Tyr
115 120 125
His Glu Gln Phe Pro Thr Ile Tyr His Leu Arg Glu His Leu Ala Asn
130 135 140
Ser Ser Glu Gln Ala Asp Leu Arg Leu Val Tyr Leu Ala Leu Ala His
145 150 155 160
Ile Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Glu Leu Asn Thr
165 170 175
Glu Asn Ser Ser Val Ser Gly Thr Phe Glu Gln Phe Ile Lys Val Tyr
180 185 190
Asn Glu Thr Phe Asn Val Glu Lys Ala Leu Asp Leu Thr Val Asp Leu
195 200 205
Asp Gly Ile Ala Glu Gln Lys Ile Ser Arg Met Lys Arg Ala Glu Leu
210 215 220
Ile Leu Ser Leu Phe Pro Glu Glu Lys Ser Thr Gly Asp Phe Ala Gln
225 230 235 240
Phe Ile Lys Met Ile Val Gly Asn Gln Gly Asn Val Lys Lys Thr Phe
245 250 255
Ser Leu Asn Glu Asp Ala Lys Ile Gln Phe Ser Lys Glu Glu Tyr Glu
260 265 270
Glu Asn Leu Glu Thr Leu Leu Ala Glu Ile Gly Glu Asp Phe Arg Ser
275 280 285
Val Phe Asp Ala Ala Lys Ser Val Tyr Asp Ala Ile Ser Leu Ala Asn
290 295 300
Ile Leu Lys Val Thr Asp Ala Thr Thr Arg Ala Lys Leu Ser Ser Ser
305 310 315 320
Met Val Val Arg Phe Thr Asp His Lys Glu Asp Leu Lys Thr Leu Lys
325 330 335
Arg Phe Val Arg Glu Asn Leu Pro Asp Glu Tyr Asp Asp Leu Phe Lys
340 345 350
Asn Lys Lys Val Ala Gly Tyr Ala Gly Tyr Ile Asp Gly Asn Glu Thr
355 360 365
Gln Glu Ala Phe Tyr Lys Tyr Leu Lys Lys Thr Leu Ala Lys Ala Thr
370 375 380
Gly Ala Glu Gly Phe Leu Ala Lys Met Glu Gln Glu Asp Phe Leu Arg
385 390 395 400
Lys Gln Arg Thr Phe Asp Asn Gly Val Ile Pro Tyr Gln Leu His Leu
405 410 415
Glu Glu Leu Lys Ala Ile Ile Lys Asn Gln Lys Ser Tyr Tyr Ser Phe
420 425 430
Leu Asp Glu Glu Lys Ile Ser Gln Leu Met Thr Phe Arg Ile Pro Tyr
435 440 445
Tyr Val Gly Pro Leu Ala Lys Glu Lys Gly Gln Phe Ala Trp Leu Thr
450 455 460
Arg Lys Glu Met Gly Lys Ile Thr Pro Trp Asn Leu Asn Glu Lys Val
465 470 475 480
Asp Ile Glu Lys Ser Ala Thr Asp Phe Val Glu Arg Met Thr Asn Asn
485 490 495
Asp Ser Tyr Leu Pro Met Glu Lys Val Leu Pro Lys His Ser Leu Leu
500 505 510
Tyr Glu Thr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Arg Tyr Ile
515 520 525
Asp Asp Asn Gly Arg Ala Gln Asn Phe Ser Ser Lys Glu Lys Arg Gln
530 535 540
Ile Ile Asn Glu Leu Phe Lys Gln Gln Arg Lys Val Lys Lys Glu Met
545 550 555 560
Leu Glu Ala Phe Leu Lys Asn Glu Tyr Gly Ile Glu Asn Pro Lys Val
565 570 575
Glu Gly Ile Glu Lys Ser Phe Asn Ala Cys Leu Gly Thr Tyr His Asp
580 585 590
Leu Ile Lys Leu Gly Ile Arg Pro Glu Leu Phe Glu Gln Pro Glu His
595 600 605
Glu Gln Gln Phe Glu Gln Ile Val Lys Ile Leu Thr Val Phe Glu Asp
610 615 620
Arg Lys Met Arg Arg Glu Gln Leu Glu Lys Phe Ser Asn Ile Leu Thr
625 630 635 640
Glu Glu Glu Gln Lys Gln Leu Glu Arg Lys His Tyr Lys Gly Trp Gly
645 650 655
Arg Leu Ser Ala Lys Leu Ile His Gly Ile Val Asp Gln Lys Thr Gln
660 665 670
Lys Thr Ile Leu Asp Tyr Leu Ile Asp Asp Asp Asp Leu Pro Lys Asn
675 680 685
Arg Asn Arg Asn Phe Met Gln Leu Ile Asn Asp Glu Asn Leu Ser Phe
690 695 700
Lys Glu Glu Ile Glu Lys Ile Ala Phe Asp Asn Asp Lys Ser Thr Glu
705 710 715 720
Glu Ile Val Gln Glu Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly Ile
725 730 735
Leu Gln Ser Leu Lys Ile Val Glu Glu Ile Ile Asp Ile Met Gly Glu
740 745 750
Leu Pro Thr Asn Ile Val Val Glu Met Ala Arg Glu Asn Gln Thr Thr
755 760 765
Ala Gln Gly Asn Arg Ala Ser Lys Ala Arg Met Lys Tyr Leu Glu Glu
770 775 780
Ser Ile Lys Lys Leu Gly Ser Ser Ile Leu Glu Asp Glu Pro Ile Ser
785 790 795 800
Lys Asp Ala Asn Leu Leu Arg Asn Asp Arg Leu Phe Leu Tyr Tyr Leu
805 810 815
Gln Asn Gly Arg Asp Met Tyr Thr Gly Asn Glu Leu Asp Ile Asn Asn
820 825 830
Leu Ser Ser Tyr Asp Ile Asp His Ile Ile Pro Gln Ser Phe Val Lys
835 840 845
Asp Asp Ser Ile Asp Asn Arg Val Leu Thr Thr Ser Ser Met Asn Arg
850 855 860
Gly Lys Ser Asn Thr Val Pro Ala Glu Ser Val Val Lys Lys Met Arg
865 870 875 880
Pro Thr Trp Glu Arg Leu Leu Ala Ser Gly Leu Ile Ser Lys Lys Lys
885 890 895
Phe Ser Tyr Leu Thr Lys Ala Thr Asn Gly Gly Leu Thr Glu Glu Asp
900 905 910
Lys Ala His Phe Ile Gln Arg Gln Leu Val Glu Thr Arg Gln Ile Thr
915 920 925
Lys Asn Val Ala Gln Ile Leu His Gln Lys Tyr Asn Asn Glu Gln Ser
930 935 940
Ser Glu Lys Pro Val Arg Val Ile Thr Leu Lys Ser Ala Leu Ala Ser
945 950 955 960
Gln Phe Arg Lys Asp Phe Ser Leu Tyr Lys Ile Arg Glu Leu Asn Asp
965 970 975
Tyr His His Ala Gln Asp Ala Tyr Leu Asn Gly Val Ile Ala Gln Ala
980 985 990
Leu Leu Lys Val Tyr Pro Lys Leu Glu Pro Glu Phe Val Tyr Gly Glu
995 1000 1005
Tyr Gln Lys Val Ser Ile Arg Ala Leu Asn Lys Ala Thr Ala Lys
1010 1015 1020
Lys Glu Phe Tyr Ser Asn Ile Met Asn Phe Phe Thr Ser Asp Glu
1025 1030 1035
Met Val Ala Asn Glu Glu Thr Gly Glu Val Leu Trp Asn Arg Gln
1040 1045 1050
Arg Asp Ile Lys Thr Ile Lys Lys Val Met Asn Tyr His Gln Met
1055 1060 1065
Asn Ile Val Lys Lys Val Glu Ile Gln Thr Gly Ala Phe Thr Lys
1070 1075 1080
Glu Ser Ile Leu Pro Lys Gly Pro Ser Lys Lys Leu Ile Ala Arg
1085 1090 1095
Lys Asn Asn Trp Ala Pro Val Asn Tyr Gly Gly Val Asp Ser Pro
1100 1105 1110
Thr Val Ala Tyr Ser Val Ile Ile Thr His Glu Lys Gly Lys Ala
1115 1120 1125
Ala Lys Val Val Gln Gln Leu Val Gly Ile Lys Ile Leu Glu Arg
1130 1135 1140
Gln Ala Phe Glu Gln Asp Glu Val Ala Phe Leu Glu Glu Lys Gly
1145 1150 1155
Phe Ile His Pro Lys Val Gln Leu Lys Leu Pro Lys Tyr Ser Leu
1160 1165 1170
Tyr Gln Phe Ala Asp Gly Arg Arg Arg Leu Leu Ala Ser Ala Glu
1175 1180 1185
Glu Ser Gln Lys Gly Asn Gln Met Val Leu Pro Val His Leu Ile
1190 1195 1200
Glu Leu Leu Tyr His Ala Lys His Val Ser Asp Ser Ser Gly Lys
1205 1210 1215
Ser Leu Glu Tyr Leu Asn Glu His Arg His Glu Phe Ala Glu Leu
1220 1225 1230
Leu Glu Ala Ile Leu Gln Phe Thr Glu Gln Tyr Ile Asp Ala Gly
1235 1240 1245
Lys Asn Gln Lys Lys Val Arg Asp Leu Tyr Glu Lys Asn Gln Asp
1250 1255 1260
Ala Asp Met Arg Glu Leu Ala Ser Ser Phe Ile Gln Leu Leu Gln
1265 1270 1275
Leu Asn Lys Gln Gly Ala Pro Ala Asp Phe Lys Phe Phe Gly Glu
1280 1285 1290
Thr Ile Pro Arg Arg Arg Tyr Lys Asn Thr Ala Glu Ile Val Asp
1295 1300 1305
Ala Thr Phe Ile Asn Gln Ser Ile Thr Gly Leu Tyr Glu Thr Gln
1310 1315 1320
Arg Arg Leu Val
1325
<210> 13
<211> 1296
<212> PRT
<213> LtCpf1 protein sequence
<400> 13
Met Lys Ser Ile Tyr Glu Asp Phe Ile Gly Leu Glu Ser Lys Asn Leu
1 5 10 15
Thr Leu Arg Phe Ala Leu Lys Pro Glu Pro Lys Thr Glu Glu Asn Leu
20 25 30
Lys Gln Tyr Trp Asp Lys Leu Arg Asp Glu Glu Arg Ala Lys Ala Tyr
35 40 45
Pro Ile Val Lys Lys Ile Leu Asp Arg Glu Tyr Gln Arg Leu Ile Ser
50 55 60
Glu Gly Leu Lys Ser Leu Glu Asn Gln Asn Ala Leu Asp Trp Thr Glu
65 70 75 80
Leu Ala Glu Tyr Ile Arg Thr Ser Ser Leu Asn Lys Lys Lys Asn Glu
85 90 95
Glu Lys Arg Leu Arg Lys Leu Ile Ala Gln Ser Leu Lys Ala His Pro
100 105 110
Leu Val Asp Lys Leu Lys Val Lys Asn Ala Phe Gly Lys Asn Gly Tyr
115 120 125
Leu Glu Thr Leu Pro Leu Gly Lys Glu Glu Lys Glu Ala Val Lys Val
130 135 140
Phe Ala Gly Phe Gly Gly Phe Phe Asn Asn Tyr Asn Lys Asn Arg Glu
145 150 155 160
Asn Tyr Phe Ser Thr Glu Glu Lys Ser Thr Ala Ile Ala Asn Arg Ile
165 170 175
Val Asn Glu Asn Phe Ser Lys His Phe Ser Asn Val Glu Ile Val Thr
180 185 190
Lys Ile Gln Lys Glu Val Pro Glu Leu Ile Gln Ile Val Glu Ala Gln
195 200 205
Phe Lys Gly Tyr Asp Ala Ile Phe Thr Val Asn Gly Tyr Asn Met Ala
210 215 220
Leu Ser Gln Ala Gly Ile Asp Thr Tyr Asn Glu Met Val Ala Ile Trp
225 230 235 240
Asn Lys Glu Ala Asn Leu Tyr Ala Gln Lys Ala Gly Lys Leu Pro Asp
245 250 255
Gly His Pro Leu Lys Lys Lys Arg Asn Tyr Leu Leu Ser Ala Leu Phe
260 265 270
Lys Gln Ile Gly Ser Glu Lys Glu His Leu Ile Gln Ile Asp Arg Phe
275 280 285
Asp Gly Asp Glu Glu Val Ile Glu Ala Leu Thr Gly Val Lys Lys Met
290 295 300
Leu Gln Glu Ala Asp Val Phe Glu Lys Leu Asn Met Leu Val Glu Asp
305 310 315 320
Met Glu Asn Trp Asp Tyr Ser Lys Ile Tyr Leu Ser Ala Gln Ser Leu
325 330 335
Ser Asn Val Ser Val Phe Leu Asn Asn Leu Tyr Glu Asp Glu Arg Glu
340 345 350
Asn Ser Trp Asn Tyr Leu Asp Asn Val Leu Arg Glu Lys Trp Gln Ile
355 360 365
Glu Leu Gln Gly Lys Lys Lys Gly Thr Asp Leu Glu Glu Ala Ile Arg
370 375 380
Lys Lys Lys Lys Ser Phe Tyr Ser Ile Ala Glu Leu Gln Glu Thr Val
385 390 395 400
Asn Ala Leu Glu Glu Thr Asp Lys Cys Tyr Ser Val Ser Lys Trp Leu
405 410 415
Leu Glu Ala Leu Lys Ser Glu Thr Val Ile Glu Glu Lys Glu Lys Asp
420 425 430
Ala Glu Asp Phe Cys Thr Lys Trp Lys Thr Glu Arg Asn Pro Leu Lys
435 440 445
Glu Thr Asp Ile Thr Ala Leu Lys Glu Tyr Leu Glu Gln Trp Ile Leu
450 455 460
Leu Ala Arg Tyr Cys Lys Ser Phe Tyr Ala Asn Gly Ile Glu Lys Lys
465 470 475 480
Glu Arg Asp Glu Ala Phe Tyr His Ile Leu Glu Asp Val Leu Tyr Val
485 490 495
Leu Lys Glu Val Ile Tyr Phe Tyr Asn Lys Val Arg Asn Tyr Val Thr
500 505 510
Lys Lys Pro Tyr Ser Leu Glu Lys Ile His Leu Lys Phe Gly His Val
515 520 525
Thr Leu Gly Asn Gly Trp His Ile Asn Gln Glu Lys Asp Asn Gly Thr
530 535 540
Thr Leu Leu Arg Lys Asp Gly Lys Tyr Tyr Leu Ala Ile Thr Asn Ser
545 550 555 560
Leu Asn Lys Lys Ile Cys Ile Pro Ser Gln Ile Glu Gly Thr Gly Asn
565 570 575
Asp Tyr Glu Lys Met Val Leu Asn Ala Phe Lys Lys Asp Lys Ile Tyr
580 585 590
Met Leu Ile Pro Lys Cys Thr Thr Glu Arg Lys Asn Val Glu Ser Cys
595 600 605
Phe Glu Ser Lys Glu Ser Ala Gln Tyr Phe Ile Ile Asp Thr Pro Lys
610 615 620
Phe Val Lys Pro Phe Lys Val Leu Arg Glu Glu Tyr Glu Leu Asn Lys
625 630 635 640
Ile Thr Tyr Asp Gly Val Lys Lys Trp Gln Ser Asp Tyr Leu Lys Lys
645 650 655
Thr Lys Asp Glu Lys Gly Tyr Lys Glu Ala Val Ala Lys Trp Ile Arg
660 665 670
Phe Cys Met Arg Phe Leu Gln Ser Tyr Lys Ser Thr Ala Ile Tyr Asp
675 680 685
Tyr Ser Thr Leu Gln Gln Pro Glu Glu Tyr Glu Thr Val Asp Ser Phe
690 695 700
Tyr Gln Asp Val Gly Lys Ile Thr Tyr Glu Cys His Phe Glu Tyr Val
705 710 715 720
Pro Thr Ser Glu Ile Glu Arg Leu Glu Asn Glu Gly Ser Ile Phe Leu
725 730 735
Phe Gln Ile Tyr Asn Lys Asp Phe Ser Glu Asn Arg Arg Pro Asp Ser
740 745 750
Lys Lys Asn Leu His Thr Leu Tyr Trp Glu Ala Leu Phe Ser Glu Glu
755 760 765
Asn Gln Lys Ala Lys Val Ile Gln Leu Ser Gly Asn Ala Glu Val Phe
770 775 780
Arg Arg Glu Lys Ser Ile Glu Asn Pro Ile Val His Lys Ala Gly Glu
785 790 795 800
Val Leu Val Asn Lys Arg Thr Lys Lys Gly Glu Pro Ile Pro Asp Asp
805 810 815
Ile Tyr Arg Asp Leu Cys Asn Tyr Phe Asn Gly Lys Asp Val Pro Ser
820 825 830
Glu Lys Glu Asp Tyr Lys Glu Tyr Leu Asp Lys Val Tyr Thr Ser Thr
835 840 845
Lys Lys Tyr Asp Ile Thr Lys Asp Lys Arg Phe Thr Glu Asn Lys Tyr
850 855 860
Glu Phe His Val Pro Ile Thr Leu Asn His Gln Ala Glu Gly Val Lys
865 870 875 880
Tyr Leu Asp Gln Lys Ile Leu Arg Met Leu Arg Asp Asn Pro Asp Val
885 890 895
Asn Ile Ile Gly Leu Asp Arg Gly Glu Arg Asn Leu Ile Ser Tyr Val
900 905 910
Val Leu Asn Gln Glu Gly Lys Ile Val Asn Asn Gln Gln Gly Ser Phe
915 920 925
Asn Ile Val Gly Lys Met Asp Tyr Gln Lys Lys Leu Tyr Gln Lys Glu
930 935 940
Lys Asn Arg Asp Lys Glu Arg Lys Thr Trp Lys Asn Ile Glu Thr Ile
945 950 955 960
Lys Asp Leu Lys Glu Gly Tyr Ile Ser Gln Val Val His Glu Leu Thr
965 970 975
Asp Met Ala Ile Arg Asn Asn Ala Ile Ile Val Met Glu Asp Leu Asn
980 985 990
Phe Gly Phe Lys Arg Val Arg Thr Lys Val Glu Arg Gln Val Tyr Gln
995 1000 1005
Lys Phe Glu Leu Ala Leu Leu Lys Lys Leu His Tyr Leu Val Thr
1010 1015 1020
Asp Lys Thr Glu Gly Lys Ala Met Leu Lys Pro Gly Gly Val Leu
1025 1030 1035
Gln Gly Tyr Gln Leu Ala Arg Glu Val Lys Thr Leu Lys Glu Ile
1040 1045 1050
Gly Lys Gln Cys Gly Cys Val Phe Tyr Val Pro Pro Gly Tyr Thr
1055 1060 1065
Ser Lys Ile Asp Pro Thr Thr Gly Phe Val Asp Val Phe Asn Met
1070 1075 1080
Ser Gly Val Thr Asn Arg Glu Lys Lys Lys Ala Phe Phe Glu Lys
1085 1090 1095
Phe Asp Asn Met Phe Tyr Asp Glu Lys Arg Asp Met Phe Gly Phe
1100 1105 1110
Ser Phe Asn Tyr Glu Lys Phe Ala Thr Tyr Gln Ser Ser His Arg
1115 1120 1125
Asn Asp Trp Ile Val Tyr Ser Asn Gly Ser Lys Tyr Val Trp Asn
1130 1135 1140
Ser Leu Asn Lys Thr Asn Glu Leu Ile Asp Val Thr Lys Glu Leu
1145 1150 1155
Lys Met Leu Phe Glu Lys Tyr Ala Ile Asn Tyr Arg Asn Glu Ala
1160 1165 1170
Leu Phe Glu Gln Ile Ile Ser Lys Asp Thr Asp Lys Asn Asn Ala
1175 1180 1185
Asp Phe Trp Asn Lys Leu Phe Trp Tyr Phe Arg Val Leu Leu Arg
1190 1195 1200
Ile Arg Asn Ser Ser Gly Glu Leu Asp Gln Ile Ile Ser Pro Val
1205 1210 1215
Leu Asn Gln Asn Gly Glu Phe Phe Glu Thr Pro Lys Lys Ile Thr
1220 1225 1230
Glu Lys Ser Tyr Leu Ser Asp Tyr Pro Met Asp Ala Asp Thr Asn
1235 1240 1245
Gly Ala Tyr His Ile Ala Leu Lys Gly Leu Tyr Leu Ile Gln Glu
1250 1255 1260
Lys Ile Ala Asp Glu Ser Val Asp Leu Asp Asp Lys Leu Pro Asn
1265 1270 1275
Asp Phe Tyr Lys Ile Ser Asn Ala Glu Trp Phe Arg Phe Arg Gln
1280 1285 1290
Lys Glu Lys
1295
<210> 14
<211> 986
<212> PRT
<213> LtCas13b protein sequence
<400> 14
Met Lys Thr Asn Gln Pro Lys Thr Ile Tyr Tyr Asn Trp Lys Asp Lys
1 5 10 15
Ala Asp Phe Ala Tyr Phe Ala Phe Tyr Thr Ser Gln Ala Leu Asn Asn
20 25 30
Thr Ser Ile Ile Leu Lys Asn Ile Ser Glu Ile Ile Glu Asn Lys Val
35 40 45
Asp Lys Thr Asn Asp Asp Asn Gln Leu Phe Ser Asn Ala Gln Val Ile
50 55 60
Lys Ile Leu Glu Ser Asn Asn Ser Val Ala Gln Lys Ser Val Ile Asn
65 70 75 80
Leu Leu Asp Ser Asn Leu Pro Phe Ser Val Lys Ser Ile Ser Asp Pro
85 90 95
Phe Leu Val Lys Asn Arg Leu Ser Tyr Phe Leu Trp Leu Leu Lys Asn
100 105 110
Phe Arg Asn Glu Tyr Ala His Tyr His Glu Arg Glu Leu Asp Ser Arg
115 120 125
Thr Thr Leu Asp Arg Glu Glu Asp Phe Lys Lys Arg Glu Asn Leu Asn
130 135 140
Lys Arg Phe Ser Lys Tyr Asp Phe Ala Asp Glu Leu Leu Lys Leu Lys
145 150 155 160
Lys Ser Ala Val Glu Glu Leu Glu Lys Arg Leu Lys Ile Asn Asn Lys
165 170 175
Glu Leu Glu Ser Asp Ser Val Tyr Lys Ser Phe Lys Asn Arg Phe Leu
180 185 190
Lys Ile Ser Ser Arg Gln Asn Phe Thr Glu Glu Asp Phe Ile Phe Phe
195 200 205
Leu Cys Leu Phe Leu Ser Thr Lys Glu Thr Met Gln Leu Leu Asn Gly
210 215 220
Ile Lys Gly Lys Lys Asn Thr Thr Thr Glu Glu Phe Gln Trp Ile Arg
225 230 235 240
Arg Val Phe Thr Ile Phe Asn Ala Lys Arg Phe Gln Asn Lys Ile Lys
245 250 255
Ser Asp Asn Pro Lys Glu Ala Phe Ile Leu Asn Ile Val Asn Asp Leu
260 265 270
Ala Lys Ile Pro Ile His Leu Lys Lys Tyr Leu Thr Glu Glu Ala Lys
275 280 285
Glu Lys Leu Ile Tyr Thr Val Gln Glu Thr Glu Asp Glu Glu Gly Asn
290 295 300
Ile Leu Glu Gln Lys Ser Glu Ala Val Thr Lys His Asp Lys Met Phe
305 310 315 320
Ser Tyr Arg Cys Leu Gln Tyr Leu Glu Leu Phe Gly Phe Ala Lys Asn
325 330 335
Ile Asn Phe Asn Ile Asn Leu Gly Lys Val Phe Ile Asn Lys Pro Tyr
340 345 350
Lys Lys Thr Ile Ile Asn Asp Glu Tyr Asp Arg Phe Leu Asp Lys Glu
355 360 365
Ile His Thr Phe Gly Lys Leu Lys Asp Phe Asp Asp Ser Tyr Phe Glu
370 375 380
Gly Tyr Ile Glu Arg Lys Glu Thr Glu Asn Gly Val Val Thr Thr Phe
385 390 395 400
Lys Gln Pro Leu Lys Phe Tyr Ser Pro Lys Tyr His Phe Ser Asn Asn
405 410 415
Arg Ile Gly Ile Lys Leu Tyr Asn Lys Asn Leu Lys Leu Glu Leu Lys
420 425 430
Glu Tyr Asp Glu Asp Gly Lys Phe Val Val Ile Asn Asn Gln Pro Asp
435 440 445
Tyr Phe Leu Ser Glu Asn Ala Ile Pro Tyr Phe Thr Tyr Cys Met Ile
450 455 460
Asn Phe Gly Glu Asp Lys Thr Leu Gly Val Ile Lys Asn Phe Glu Thr
465 470 475 480
Asn Phe Lys Arg Phe Leu Asn Asp Val Ser Ile Gly Lys Ser Ile Lys
485 490 495
Val Gln Glu Ile Glu Gln Asn Tyr Ser Leu Lys Ile Gly Trp Ile Pro
500 505 510
Ser Leu Ile Arg Asp Asn Leu Phe His Asp Asn Asp Lys Thr Phe Glu
515 520 525
Asp Val Val Lys Glu Lys Ile Asn Ser Leu Lys Glu Glu Thr Gln Lys
530 535 540
Leu Ile Glu Asn Asn Asn Lys Lys Pro Glu Asp Arg Asp Arg Asn Ile
545 550 555 560
Lys Phe Ser Phe Lys Lys Gly Asp Leu Ala Thr Phe Val Ala Lys Asp
565 570 575
Ile Ile Tyr Phe Met Glu Leu Lys Glu Glu Ile Val Asn Gly Lys Lys
580 585 590
Val Val Ser Lys Leu Ser Ser Ile Glu Tyr Asp Val Leu Gln Ser Lys
595 600 605
Leu Ala Phe Tyr Gly Lys His Glu Glu Asp Leu Lys Leu Leu Phe Lys
610 615 620
Lys Trp Asn Leu Asn Glu Arg His Pro Phe Leu Lys Asp Val Ser Met
625 630 635 640
Glu Lys Val Glu Asp Arg Arg Gly Phe Lys Arg Gln Ile Gly Ile Lys
645 650 655
Lys Phe Phe Lys Asn Tyr Ile Tyr Gln Arg Lys Tyr Trp Leu Asn Asp
660 665 670
Leu Lys Leu Glu Ile Asn Lys Glu Asn Tyr His Phe Val Asp Asp Tyr
675 680 685
Lys Val Cys Lys Asn Asp Thr Gln Ile Lys Ala Tyr Ala Ser Asn Leu
690 695 700
Leu Asn His Thr Ile Tyr Leu Pro Asn Asp Leu Phe Ser Asp Leu Ile
705 710 715 720
Leu Glu Asn Ser Asn Phe Ala Val Glu Lys Ala Asn Thr Asn Phe Leu
725 730 735
Ile Ser Lys Asn Leu Glu Tyr Phe Gly Asn Gln Trp Phe Tyr Asn Lys
740 745 750
Glu Asn Tyr Gln Gln Gly Leu Asp Thr Tyr Gln Ser His Leu Arg Asp
755 760 765
Lys Gln Ile Arg Lys Ala Ile Thr Leu Asp Arg Leu Tyr Trp Asn Met
770 775 780
Leu Lys Ile Asn Thr Gln Ile Pro Glu Phe Ser Gln Ile Phe Glu Gln
785 790 795 800
Asn Asn Leu Ala Ala Tyr Asn Ser Glu Gln Thr Leu Leu Glu Lys Gln
805 810 815
Ile Arg Met Ser Ser Asn Phe Lys Leu Ala Lys Glu Asp Phe Arg Tyr
820 825 830
Leu Asp Phe Asp Arg Asp Phe Glu Ile Lys Ile Glu Gly Tyr Arg Lys
835 840 845
Ile Lys Asp Phe Gly Ile Phe Arg His Leu Ile Lys Asp Arg Arg Val
850 855 860
Pro Ser Ile Leu Ala Phe Tyr Thr Lys Tyr Lys Lys Leu Gly Ser Ile
865 870 875 880
Asn Glu Glu Ile Ile Arg Asn Glu Ile Leu Asp Phe Glu Tyr Tyr Lys
885 890 895
Ile Ser Ile Leu Lys Arg Val Leu Glu Ile Asp Lys Gln Ile Tyr Asn
900 905 910
Asn Leu Ile Gly Lys Asn Ile Ser Ile His Glu Lys Phe Ser Glu Asn
915 920 925
Val Asn Lys Leu Tyr Leu Glu Asn Gln Lys Leu Ala Asn Lys Ile Ile
930 935 940
Glu Ile Arg Asn Lys Leu Leu His Asn Lys Val Pro Glu Ile Asn Leu
945 950 955 960
Glu Asn Ile Gln Lys Thr Phe Ser Glu Thr Leu Phe Leu Glu Met Glu
965 970 975
Lys Ser Cys Asn Glu Leu Glu Arg Leu Ile
980 985
<210> 15
<211> 957
<212> PRT
<213> LtCas13d protein sequence
<400> 15
Met Ile Leu Ile Leu Gly Glu Gly Thr Ile Arg Met Ala Lys Lys Lys
1 5 10 15
Asn Ala Arg Gln Arg Arg Glu Glu Glu Lys Asn Arg Ile Lys Ala Ile
20 25 30
Ile Glu Lys Ile Lys Asn Lys Val Val Glu Lys Glu Glu Thr Glu Glu
35 40 45
Ile Val Glu Asn Asn Glu Thr Lys Asn Val Glu Ser Ile Val Val Glu
50 55 60
Pro Lys Lys Lys Ser Leu Ala Lys Ala Ser Gly Val Lys Ser Val Phe
65 70 75 80
Ile Asn Asn Asp Glu Ile Ile Met Thr Ser Phe Gly Arg Gly Asn Asp
85 90 95
Ala Val Ile Glu Lys Ile Ile Lys Asp Asn Asn Ile Asp Asn Glu Asn
100 105 110
Lys Asp Lys Pro Val Tyr Asp Val Val Ala Ile Glu Asn Glu Gly Asn
115 120 125
Ile Ile Lys Val Gln Ser Glu Arg Phe Lys Ala Ile Glu Ser Ala Asn
130 135 140
Thr Glu Ile Pro Pro Glu Arg Asn Gly Met Asp Leu Ile Lys Arg Lys
145 150 155 160
Asp Lys Leu Glu Glu Val Tyr Phe Gly His Thr Phe Asn Asp Asn Ile
165 170 175
His Ile Gln Leu Ile Tyr Asn Ile Leu Asp Ile Glu Lys Ile Leu Ser
180 185 190
Val Tyr Ile Asn Asn Ile Val Tyr Ala Leu Gly Asn Leu Glu Arg Lys
195 200 205
Asp Thr Asp Glu Glu Lys Asp Leu Ile Gly Tyr Ser Ser Ala Arg Ala
210 215 220
Lys Tyr Glu Asp Phe Ile Glu Asn Glu Lys Leu Glu Asp Arg Lys Lys
225 230 235 240
Leu Leu Glu Glu Phe Ile Glu Asn Gly Asp Arg Leu Gly Tyr Phe Gly
245 250 255
Asn Val Phe Phe Lys Asn Asp Lys Glu Leu Lys Ser Lys Lys Glu Ile
260 265 270
Tyr Asn Ile Leu Gly Leu Leu Gly Ser Leu Arg Gln Phe Cys Phe His
275 280 285
Tyr Asn Glu Ala Val Phe Glu Asn Glu Glu Gly Lys Ile Asn Gln Glu
290 295 300
Tyr Lys Ser Asn Ser Trp Leu Tyr Asn Leu Gly Gln Leu Phe Asp Glu
305 310 315 320
Phe Lys Asp Thr Leu Asn Gly Phe Tyr Asn Glu Lys Ile Asp Ser Ile
325 330 335
Asn Lys Asp Phe Ile Lys Thr Asn Gln Ile Asn Leu His Ile Ile Cys
340 345 350
Ser Glu Leu Gly Met Asn Met Asp Lys Glu Gln Val Val Gly Asp Tyr
355 360 365
Tyr Asp Phe Ile Ile Ser Lys Lys His Lys Asn Met Gly Phe Ser Ile
370 375 380
Lys Lys Ile Arg Glu Tyr Met Phe Asp Ile Tyr Glu Ala Phe Asp Ile
385 390 395 400
Lys Asp Lys Glu Phe Asp Ser Val Arg Ser Ile Leu Tyr Lys Ile Ile
405 410 415
Asp Phe Ile Ile Tyr Tyr Ser Phe Ile His Tyr Lys Asn Asp Ile Ala
420 425 430
Glu Asn Ile Val Ser Arg Leu Arg Val Ser Leu Ser Glu Glu Asp Lys
435 440 445
Asp Lys Val Tyr Glu Glu Ile Ala Arg Asp Thr Trp Asn Glu Tyr Lys
450 455 460
Asp Gln Ile Asn Lys Leu Lys Glu Leu Leu Thr Lys Arg Ile Gly Glu
465 470 475 480
Phe Ser Asp Ala Lys Asn Lys Asn Val Tyr Tyr Lys Glu Phe Glu Ser
485 490 495
Ile Lys Phe Asp Glu Ile Gly Lys Lys Lys Leu Gly Glu Asn Ala Asp
500 505 510
Tyr Phe Cys Lys Leu Met Tyr Leu Leu Thr Leu Phe Leu Asp Gly Lys
515 520 525
Glu Ile Asn Asp Leu Leu Thr Thr Leu Ile Asn Lys Phe Asp Asn Ile
530 535 540
Arg Ser Phe Ile Glu Ile Met Glu Glu Lys Gln Ile Glu Cys Asn Phe
545 550 555 560
Asp Glu Lys Phe Ser Phe Phe Asp Glu Ser Lys Asn Val Cys Asp Thr
565 570 575
Leu Arg Glu Val Asn Ser Phe Ala Arg Met Gln Arg Pro Leu Asp Asn
580 585 590
Lys Ser Val Gln Arg Glu Met Tyr Arg Asp Ala Ile Lys Ile Leu Leu
595 600 605
Lys Asp Thr Trp Val Glu Glu Lys Asn Ile Asp Arg Ile Leu Asp Glu
610 615 620
Tyr Ile Pro Asn Lys Glu Asn Lys Ser Ile Lys Lys Asp Phe Arg Asn
625 630 635 640
Phe Ile Ile Lys Asn Ile Ile Lys Ser Asn Arg Phe Ile Tyr Leu Ile
645 650 655
Lys Tyr Ser Asn Pro Thr Asp Val Arg Lys Leu Ala Ser Asn Lys Asp
660 665 670
Val Val Lys Phe Val Leu Asn Thr Ile Pro Glu Ala Gln Ile Asp Arg
675 680 685
Tyr Tyr Asn Ser Cys Gly Leu Pro Leu Glu Glu Asp Asn Asn Val Gln
690 695 700
Ile Glu Lys Leu Ser Glu Ile Ile Thr Asn Ile Asp Tyr Ile Glu Phe
705 710 715 720
Leu Asp Val Gln Gln Ser Tyr Lys Asn Glu Asp Lys Ser Gln Lys Gln
725 730 735
Ala Val Val Thr Leu Tyr Leu Thr Ile Leu Tyr Ile Leu Thr Lys Asn
740 745 750
Leu Val Asn Val Asn Ser Arg Tyr Val Ile Ala Leu His Cys Leu Glu
755 760 765
Arg Asp Ser Thr Leu Tyr Gly Ile Lys Leu Lys Lys Glu Lys Asn Lys
770 775 780
Pro Ser Lys Tyr His Lys Leu Thr Gln Tyr Phe Ile Asp Asn Arg Tyr
785 790 795 800
Phe Asp Arg Lys Lys Lys Asp Arg Lys Asn Gly Glu Tyr Val Ser Lys
805 810 815
Lys Ile Ser Gly Tyr Ile Glu Lys Asn Met Lys Asn Tyr Ile Glu Cys
820 825 830
Glu Gln Ile Glu Thr Thr Glu Gln Tyr Lys Glu Thr Gly Val Asp Met
835 840 845
Phe Ile Asn Tyr Arg Asn Ser Ile Ala His Leu Asn Thr Val Arg Lys
850 855 860
Ala Ser Lys Tyr Ile Lys Asp Ile Lys Tyr Phe Gly Thr Tyr Phe Glu
865 870 875 880
Leu Tyr His Tyr Ile Met Gln Arg Tyr Leu Lys Asp Asn Ile Glu Leu
885 890 895
Lys Gly Glu Asn Asn Ala Leu Glu Gly Tyr Phe Asp Asn Leu Cys Lys
900 905 910
Tyr Gly Thr Tyr Val Lys Asp Phe Val Lys Thr Leu Asn Val Pro Phe
915 920 925
Ala Tyr Asn Tyr Pro Arg Tyr Lys Asn Leu Ser Ile Asp Glu Leu Phe
930 935 940
Asp Lys Asn Asn Thr Arg Lys Thr Lys Lys Ser Ser Leu
945 950 955

Claims (10)

1. A construction method of a homologous type 2 CRISPR/Cas gene editing system, characterized by comprising the following steps:
(1) processing metagenomic sequencing data and screening to obtain contig sequences and translated protein sequences;
(2) clustering the protein sequences from step (1) using software, and then expanding the clusters using the HMMER software package;
(3) searching the expanded clusters from step (2) for clusters related to the type 2 CRISPR/Cas system, using hidden Markov model files of known type 2 CRISPR/Cas system-related effector proteins;
(4) comparing the clusters related to the type 2 CRISPR/Cas system from step (3) against related databases, and screening and filtering to obtain new genes;
(5) performing CRISPR/Cas system prediction on the contig sequences where the genes screened in step (4) are located, extracting the predicted effector proteins of the type 2 CRISPR/Cas system, and analyzing their structural domains using software;
(6) performing comparative genomics and evolutionary analysis on the structural domains of the effector proteins of the type 2 CRISPR/Cas system predicted in step (5) and on the accessory proteins related to those effector proteins, to obtain the novel homologous type 2 CRISPR/Cas gene editing system.
2. The construction method according to claim 1, wherein in step (1), processing the metagenomic sequencing data specifically comprises: performing quality control, assembly and gene prediction on the metagenomic data; the metagenomic sequencing data are intestinal metagenomic sequencing data; the length of each contig sequence is greater than or equal to 4000, and the length of each protein sequence is greater than or equal to 600.
3. The construction method according to claim 1, wherein step (2) specifically comprises: clustering the protein sequences using OrthoFinder software, and then expanding the clusters using the HMMER software package.
4. The construction method according to claim 1, wherein in step (4), the related databases are the Swiss-Prot database and the NCBI nr database.
5. The construction method according to claim 1, wherein in step (5), the software for CRISPR/Cas system prediction is CRISPRCasFinder, and the software for analyzing the structural domains is HHpred.
6. A homologous type 2 CRISPR/Cas gene editing system constructed by the construction method of any one of claims 1-5.
7. The homologous type 2 CRISPR/Cas gene editing system of claim 6, wherein the homologous type 2 CRISPR/Cas gene editing system comprises effector Cas ribonucleoprotein complexes (Cas9, Cas12 and Cas13) and CRISPR RNA or trans-activating CRISPR RNA.
8. The homologous type 2 CRISPR/Cas gene editing system of claim 7, further comprising an accessory protein Cas1, an accessory protein Cas2, an accessory protein Csn, an accessory protein Cas4, an accessory protein csx27, an accessory protein csx28 and an accessory protein WYL.
9. The homologous type 2 CRISPR/Cas gene editing system of claim 8, wherein the chain length of the effector Cas ribonucleoprotein complex is 900-1600 aa, the length of the CRISPR RNA is 15-36 bp, the chain length of the trans-activating CRISPR RNA complex is 70-160 bp, and the length of the PAM sequence recognized by the effector Cas9/Cas12 ribonucleoprotein complex is 1-10 bp.
10. The homologous type 2 CRISPR/Cas gene editing system of claim 9, wherein the homologous type 2 CRISPR/Cas gene editing system has 3 major classes of effector proteins, namely effector Cas9 proteins, effector Cas12 proteins and effector Cas13 proteins;
there are 12 effector Cas9 proteins, numbered C1556, C1793, C1807, C4640, C6165, Lt1, Lt2, Lt3, Lt4, Lt5, Lt6 and Lt7; there is 1 effector Cpf1 protein, numbered LtCpf1; and there are 2 effector Cas13 proteins, numbered LtCas13b and LtCas13d; wherein, in the homologous type 2 CRISPR/Cas9 gene editing systems numbered C1556, C1793, C1807, C4640 and C6165: the chain lengths of the Cas9 ribonucleoprotein complexes are 1144 aa, 1358 aa, 1426 aa, 1315 aa and 1152 aa respectively; the chain lengths of the complexes of CRISPR RNA and trans-activating CRISPR RNA are 126 bp, 124 bp, 141 bp, 152 bp and 120 bp respectively; and the PAM sequences recognized by the Cas9 ribonucleoprotein complexes are W (A > T) TNTAH (A > T > C) NNAT, NATS (G > C) NY (C > T) GAT, NNTA, NNCGC and CGNGAGG respectively.
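
Illustrative note (not part of the claims): the PAM sequences in claim 10 use degenerate nucleotide letters such as N, W, S, Y and H. Assuming these follow the standard IUPAC nucleotide codes, the minimal Python sketch below shows one way such a PAM could be expanded into a regular expression and located in a DNA string. The function names `pam_to_regex` and `find_pam_sites` are hypothetical and do not come from the patent. The sketch handles only unannotated PAMs such as NNTA, NNCGC and CGNGAGG; the base-preference annotations such as "(A > T)" are not modeled.

```python
import re
from typing import List

# Standard IUPAC nucleotide codes. Assumption: the claim's degenerate
# letters (N, W, S, Y, H) follow this convention.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def pam_to_regex(pam: str) -> "re.Pattern":
    """Compile a degenerate PAM such as 'NNCGC' into a regular expression."""
    parts = []
    for code in pam.upper():
        bases = IUPAC[code]  # raises KeyError for a non-IUPAC character
        # A single base is emitted literally; several bases become a class.
        parts.append(bases if len(bases) == 1 else "[" + bases + "]")
    return re.compile("".join(parts))


def find_pam_sites(sequence: str, pam: str) -> List[int]:
    """Return 0-based start positions of all PAM matches on the given strand.

    A zero-width lookahead is used so that overlapping matches are reported.
    """
    lookahead = re.compile("(?=(" + pam_to_regex(pam).pattern + "))")
    return [m.start() for m in lookahead.finditer(sequence.upper())]
```

For example, `find_pam_sites(dna, "NNCGC")` would list every position in the hypothetical string `dna` at which an NNCGC motif begins; scanning the reverse strand would require a separate call on the reverse complement.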
CN202110589533.8A 2020-09-11 2021-05-28 Construction method of homologous type 2 CRISPR/Cas gene editing system Pending CN113851186A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010952121 2020-09-11
CN2020109521211 2020-09-11

Publications (1)

Publication Number Publication Date
CN113851186A (en) 2021-12-28

Family

ID=74320957

Family Applications (2)

Application Number Title Priority Date Filing Date
CN202011311024.0A Withdrawn CN112331264A (en) 2020-09-11 2020-11-20 Construction method of homologous type 2 CRISPR/Cas gene editing system
CN202110589533.8A Pending CN113851186A (en) 2020-09-11 2021-05-28 Construction method of homologous type 2 CRISPR/Cas gene editing system

Country Status (2)

Country Link
CN (2) CN112331264A (en)
WO (1) WO2022052211A1 (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022052211A1 (en) * 2020-09-11 2022-03-17 中山大学附属第一医院 Homologous type 2 crispr/cas9 gene editing system and construction method therefor
CN113234701B (en) * 2020-10-20 2022-08-16 珠海舒桐医疗科技有限公司 Cpf1 protein and gene editing system
CN113234702B (en) * 2021-03-26 2023-02-10 珠海舒桐医疗科技有限公司 Lt1Cas13d protein and gene editing system
CN116751763B (en) * 2023-05-08 2024-02-13 珠海舒桐医疗科技有限公司 Cpf1 protein, V-type gene editing system and application
CN117448300B (en) * 2023-05-08 2024-04-30 珠海舒桐医疗科技有限公司 Cas9 protein, type II CRISPR/Cas9 gene editing system and application
CN117866926A (en) * 2024-03-07 2024-04-12 珠海舒桐医疗科技有限公司 CRISPR-FrCas9 protein mutant and application thereof

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022052211A1 (en) * 2020-09-11 2022-03-17 中山大学附属第一医院 Homologous type 2 crispr/cas9 gene editing system and construction method therefor

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018035250A1 (en) * 2016-08-17 2018-02-22 The Broad Institute, Inc. Methods for identifying class 2 crispr-cas systems

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
DAVID BURSTEIN等: "New CRISPR–Cas systems from uncultivated microbes", 《NATURE》, vol. 542, no. 7640, 12 December 2016 (2016-12-12), pages 237 - 240, XP037520544, DOI: 10.1038/nature21059 *
唐连超等: "CRISPR-Cas 基因编辑系统升级:聚焦Cas 蛋白和PAM", 《遗传》, vol. 42, no. 3, 6 March 2020 (2020-03-06) *
方静;侯佳林;张宇;王风平;何莹;: "产甲烷古菌中CRISPR簇的研究", 微生物学通报, vol. 43, no. 11, 16 June 2016 (2016-06-16) *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114075559A (en) * 2020-09-14 2022-02-22 珠海舒桐医疗科技有限公司 Type 2 CRISPR/Cas9 gene editing system and application thereof
CN114075559B (en) * 2020-09-14 2023-11-17 珠海舒桐医疗科技有限公司 2-type CRISPR/Cas9 gene editing system and application thereof
EP4230734A1 (en) * 2022-02-21 2023-08-23 Zhuhai Shu Tong Medical Technology Co., Ltd. Type ii crispr/cas9 genome editing system and the application thereof

Also Published As

Publication number Publication date
WO2022052211A1 (en) 2022-03-17
CN112331264A (en) 2021-02-05

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right
Effective date of registration: 20221118
Address after: A2301J, No. 360, Hengtong Road, Jing'an District, Shanghai, 200000
Applicant after: Little Skylark Health Management Co.,Ltd.
Address before: 510000 No. 135 West Xingang Road, Guangdong, Guangzhou
Applicant before: SUN YAT-SEN University
Applicant before: THE FIRST AFFILIATED HOSPITAL OF SUN YAT-SEN University