WO2013169802A1

WO2013169802A1 - Methods and compositions for nuclease-mediated targeted integration of transgenes

Info

Publication number: WO2013169802A1
Application number: PCT/US2013/039979
Authority: WO
Inventors: Gregory J. Cost; Fyodor Urnov; William M. Ainley; Joseph F. Petolino; Jayakumar Pon Samuel; Steven R. Webb; Lakshmi SASTRY-DENT
Original assignee: Sangamo Biosciences, Inc.; Dow Agrosciences Llc
Priority date: 2012-05-07
Filing date: 2013-05-07
Publication date: 2013-11-14
Also published as: KR102116153B1; AU2013259647A1; KR20150006469A; JP6559063B2; RU2650819C2; US10174331B2; CA2871524C; CA2871524A1; IL235421B; RU2014149120A; JP2015516162A; AU2013259647B2; BR112014027813A2; CN104471067B; IL235421A0; EP2847338B1; JP2019088321A; EP2847338A4; HK1208051A1; EP2847338A1

Abstract

Disclosed herein are methods and compositions for homology-independent targeted insertion of donor molecules into the genome of a cell.

Description

METHODS AND COMPOSITIONS FOR NUCLEASE-MEDIATED TARGETED INTEGRATION OF TRANSGENES

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] The present application claims the benefit of U.S. Provisional Application No. 61/643,812, filed May 7, 2012, the disclosure of which is hereby incorporated by reference in its entirety.

STATEMENT OF RIGHTS TO INVENTIONS MADE UNDER FEDERALLY SPONSORED RESEARCH

[0002] Not applicable.

TECHNICAL FIELD

[0003] The present disclosure is in the field of genome engineering, particularly targeted modification of the genome of a cell.

BACKGROUND

[0004] Integration of foreign DNA into the genome of organisms and cell lines is a widely utilized method for interrogation and manipulation of biological systems. Traditionally, transgene insertion is targeted to a specific locus by provision of a plasmid carrying a transgene, and containing substantial DNA sequence identity flanking the desired site of integration. Spontaneous breakage of the chromosome followed by repair using the homologous region of the plasmid DNA as a template results in the transfer of the intervening transgene into the genome. See, e.g., Koller et al. (1989) Proc. Nat'l Acad. Sci. USA 86(22):8927-8931; Thomas et al. (1986) Cell 44(3):419-428. The frequency of this type of homology-directed targeted integration can be increased by up to a factor of 10⁵ by deliberate creation of a double-strand break in the vicinity of the target region (Hockemeyer et al. (2009) Nature Biotech. 27(9):851-857; Lombardo et al. (2007) Nature Biotech. 25(11):1298-1306; Moehle et al. (2007) Proc. Nat'l Acad. Sci. USA 104(9):3055-3060; Rouet et al. (1994) Proc. Nat'l Acad. Sci. USA 91(13):6064-6068).

[0005] A double-strand break (DSB) or nick can be created by a site-specific nuclease such as a zinc-finger nuclease (ZFN) or TAL effector domain nuclease (TALEN), or using the CRISPR/Cas9 system with an engineered crRNA/tracrRNA (single guide RNA) to guide specific cleavage. See, for example, Burgess (2013) Nature Reviews Genetics 14:80-81, Urnov et al. (2010) Nature 435(7042):646-51; United States Patent Publications 20030232410; 20050208489; 20050026157; 20050064474; 20060188987; 20090263900; 20090117617; 20100047805; 20110207221; 20110301073 and International Publication WO 2007/014275, the disclosures of which are incorporated by reference in their entireties for all purposes. In many organisms, transgene insertion can be accomplished via homology-directed repair (HDR) processes, which require the inserted transgene to include regions of homology to the site of insertion (cleavage). However, some organisms and cell lines lack traditional HDR processes and targeted integration occurs primarily via the homology-independent non-homologous end joining (NHEJ) DNA repair machinery. As such, to date, in organisms and cell lines (e.g., CHO cells) that are recalcitrant to HDR processes, only relatively short (<100 bp) oligonucleotides have been integrated via homology-independent pathways following nuclease-mediated cleavage of the target locus. See, e.g., Orlando et al. (2010) Nucleic Acids Res. 38(15):e152 and U.S. Patent Publication No. 20110207221.

[0006] Thus, there remains a need for compositions and methods for homology-independent targeted integration of transgenes, including larger transgenes, directly into the site of cleavage, for example in organisms and cell lines that lack, or are deficient in, traditional homology-driven approaches.

SUMMARY

[0007] Disclosed herein are methods and compositions for homology-independent targeted integration of a transgene.

[0008] In one aspect, described herein are double-stranded donor polynucleotides for integration into an endogenous locus of choice following in vivo cleavage of the donor using at least one nuclease. The donor polynucleotides include an exogenous sequence (transgene) to be integrated into the endogenous locus and contain at least one target site for a nuclease, for example two paired nuclease binding sites separated by a "spacer" sequence separating near edges of binding sites. The spacer can be of any size, for example, between 4 and 20 base pairs (or any value therebetween). Donors having multiple nuclease target sites may have the same or different target sites, for example, two of the same paired sites flanking the transgene or two different paired sites flanking the transgene. The donor nucleotides do not require the presence of homology arms flanking the transgene sequence. The only chromosomal homology that may be present in the donor sequence is(are) the nuclease binding site(s). In embodiments in which the nuclease target sites exhibit homology to the genome, the homology to the genome is less than 50 to 100 (or any number of base pairs between 50 and 100) contiguous base pairs in length. In certain embodiments, where the nuclease used to cleave the donor is not the same as the nuclease used to cleave the chromosome, there may be no homology between the chromosomal locus cleaved by the nuclease(s) and the donor sequence. In addition, the nuclease target site(s) are not within the transgene and, as such, cleavage of the donor polynucleotide by the nuclease(s) that bind(s) to the target site(s) does not modify the transgene. In certain embodiments, the donor nucleic acid comprises two target sites and the spacer sequence between the two target sites is non-naturally occurring, for example when the spacer sequence does not occur in a genomic sequence between the two target sites present in the genome.
In certain embodiments, the donor molecules are integrated into the endogenous locus via homology-independent mechanisms (e.g., NHEJ). In other embodiments, the double-stranded donor comprises a transgene of at least 1 kb in length and nuclease target site(s) 3' and/or 5' of the transgene for in vivo cleavage. In certain embodiments, the nuclease target site(s) used to cleave the donor are not re-created upon integration of the transgene, for example when the spacer between paired target sites is not present in and/or does not exhibit homology to an endogenous locus. The donor molecule may be, for example, a plasmid. In certain embodiments, the donor is integrated following nuclease-mediated cleavage of the endogenous locus. In any nuclease-mediated integration of the donor molecule, one or more of the nucleases used to cleave the donor may be the same as one or more of the nucleases used to cleave the endogenous locus. Alternatively, one or more of the nucleases used to cleave the donor may be different from one or more of the nucleases used to cleave the endogenous locus.
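
By way of illustration only, the donor-site constraints described above can be sketched in a few lines of Python. This sketch is not part of the disclosed methods: the function names, the example sequences, and the same-strand homology check are all hypothetical placeholders chosen for the example. It assembles a paired target site from two binding sites and a spacer, enforces the exemplary 4 to 20 base pair spacer range, derives a reverse-complement spacer that differs from the genomic spacer, and measures the longest contiguous run shared between a donor and a genomic sequence.

```python
# Hypothetical illustration only: all sequences below are made-up placeholders,
# not sequences from this disclosure.

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def paired_target_site(left_site: str, spacer: str, right_site: str,
                       min_spacer: int = 4, max_spacer: int = 20) -> str:
    """Join two nuclease binding sites around a spacer.

    The default bounds mirror the exemplary 4 to 20 bp spacer range in the text.
    """
    if not min_spacer <= len(spacer) <= max_spacer:
        raise ValueError(f"spacer length {len(spacer)} outside "
                         f"{min_spacer}-{max_spacer} bp")
    return left_site + spacer + right_site


def longest_shared_run(a: str, b: str) -> int:
    """Length of the longest contiguous sequence common to a and b.

    Simplification (assumption): only the given strand of each sequence
    is compared; reverse-complement matches are ignored.
    """
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


if __name__ == "__main__":
    left, spacer, right = "ACCCCACAGTGG", "GGCCAC", "TAGGGACAGGAT"  # placeholders
    genomic_site = paired_target_site(left, spacer, right)
    # Donor keeps the binding sites but carries a spacer that differs from the genome.
    donor_site = paired_target_site(left, reverse_complement(spacer), right)
    print("genomic site:", genomic_site)
    print("donor site:  ", donor_site)
    print("longest shared run (bp):", longest_shared_run(genomic_site, donor_site))
```

In such a sketch, a donor would be flagged when the longest shared run with the chromosomal locus reaches a chosen limit (for example, a value between 50 and 100 bp), consistent with the statement above that any homology is confined to the nuclease binding site(s).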

[0009] In some embodiments, the donor is contained on a plasmid. The donor may be integrated following nuclease-mediated cleavage where the sequence to be integrated (donor or transgene) is flanked in the plasmid by at least two nuclease cleavage sites. In other embodiments, the donor is contained on a plasmid, wherein the donor may be integrated following nuclease-mediated cleavage where the sequence to be integrated (donor or transgene) is the plasmid comprising a single nuclease cleavage site. In certain embodiments, the sequence of the nuclease cleavage sites in the donor plasmid is the same as the sequence of the nuclease cleavage site in the chromosomal locus to be targeted. In embodiments in which the cleavage sites are the same as between the donor and the genome, the sequences separating the cleavage sites may be the same or different. In certain embodiments, the sequences separating the cleavage sites (spacers) are different in the donor as compared to the genome such that following cleavage of the donor, the target site(s) is(are) not re-created and the donor cannot be cleaved again by the same nuclease(s). In other embodiments, the nuclease cleavage sites flanking the donor on the donor-containing plasmid are different from the cleavage site in the chromosome. In further embodiments, the nuclease cleavage sites flanking the donor in the donor-containing plasmid are not the same, and also may be different from the nuclease cleavage site in the chromosome. In further embodiments, the donor may be contained on a plasmid flanked by at least two nuclease cleavage sites and may be integrated into a deletion in the chromosome created by the action of two nucleases. In this embodiment, the nuclease cleavage sites flanking the donor on the plasmid and the nuclease cleavage sites in the chromosome may either be the same or may be different.

[0010] The sequence of interest of the donor molecule may comprise one or more sequences encoding a functional polypeptide (e.g., a cDNA), with or without a promoter. In certain embodiments, the nucleic acid sequence comprises a sequence encoding an antibody, an antigen, an enzyme, a growth factor, a receptor (cell surface or nuclear), a hormone, a lymphokine, a cytokine, a reporter, an insect resistant gene, a herbicide tolerance gene, a transcription factor, sequestration protein or functional fragments of any of the above and combinations of the above. The sequence of interest of the donor molecule may comprise one or more sequences that encode an RNA molecule that encodes a functional or structural RNA, for example, an RNAi, sRNAi, and/or mRNAi. In embodiments in which the functional polypeptide encoding sequences are promoterless, expression of the integrated sequence is then ensured by transcription driven by an endogenous promoter or other control element in the region of interest. In other embodiments, a "tandem" cassette is integrated into the selected site in this manner, the first component of the cassette comprising a promoterless sequence as described above, followed by a transcription termination sequence, and a second sequence, encoding an autonomous expression cassette.

Additional sequences (coding or non-coding sequences) may be included in the donor molecule, including but not limited to, sequences encoding a 2A peptide, SA site, IRES, etc. In certain embodiments, the donor nucleic acid (transgene) comprises sequences encoding functional RNAs for example, miRNAs or shRNAs.

[0011] In another aspect, described herein are methods of integrating a donor nucleic acid (e.g., a donor molecule as described herein) into the genome of a cell via homology-independent mechanisms. The methods comprise creating a double-stranded break (DSB) in the genome of a cell and cleaving the donor molecule using one or more nucleases, such that the donor nucleic acid is integrated at the site of the DSB. In certain embodiments, the donor nucleic acid is integrated via non-homology dependent methods (e.g., NHEJ). As noted above, upon in vivo cleavage the donor sequences can be integrated in a targeted manner into the genome of a cell at the location of a DSB. The donor sequence can include one or more of the same target sites for one or more of the nucleases used to create the DSB. Thus, the donor sequence may be cleaved by one or more of the same nucleases used to cleave the endogenous gene into which integration is desired. In certain embodiments, the donor sequence includes different nuclease target sites from the nucleases used to induce the DSB. DSBs in the genome of the target cell may be created by any mechanism. In certain embodiments, the DSB is created by one or more (e.g., a dimerizing pair of) zinc-finger nucleases (ZFNs), fusion proteins comprising a zinc finger binding domain, which is engineered to bind a sequence within the region of interest, and a cleavage domain or a cleavage half-domain. In other embodiments, the DSB is created by one or more TALE DNA-binding domains (naturally occurring or non-naturally occurring) fused to a nuclease domain (TALEN). In still further embodiments, cleavage is performed using a nuclease system such as CRISPR/Cas with an engineered crRNA/tracrRNA.

[0012] Furthermore, in any of the methods described herein, the first and second cleavage half-domains may be from a Type IIS restriction endonuclease, for example, FokI or StsI. Furthermore, in any of the methods described herein, at least one of the fusion proteins may comprise an alteration in the amino acid sequence of the dimerization interface of the cleavage half-domain, for example such that obligate heterodimers of the cleavage half-domains are formed. Alternatively, in any of the methods described herein the cleavage domain may be a naturally or non-naturally occurring (engineered) meganuclease.

[0013] In any of the methods described herein, the cell can be any eukaryotic cell, for example a plant cell or a mammalian cell or cell line, including COS, CHO (e.g., CHO-S, CHO-K1, CHO-DG44, CHO-DUXB11, CHO-DUKX, CHOK1SV), VERO, MDCK, WI38, V79, B14AF28-G3, BHK, HaK, NS0, SP2/0-Ag14, HeLa, HEK293 (e.g., HEK293-F, HEK293-H, HEK293-T), and perC6 cells as well as insect cells such as Spodoptera frugiperda (Sf), or fungal cells such as Saccharomyces, Pichia and Schizosaccharomyces. In certain embodiments, the cell line is a CHO, MDCK or HEK293 cell line. Suitable cells also include stem cells such as, by way of example, embryonic stem cells, induced pluripotent stem cells, hematopoietic stem cells, neuronal stem cells and mesenchymal stem cells. Furthermore, the cell may be arrested in the G2 phase of the cell cycle. In some embodiments of the methods described herein, the cell may be one lacking efficient homology-based DNA repair, for example a CHO cell. In certain embodiments, the cells may be primary or non-dividing cells which preferentially use the NHEJ DNA repair pathway. In some embodiments, the cell can be a plant or fungal cell. In other embodiments, the methods described herein may be used in cells with unsequenced genomes. These cells can be used to create cell lines and/or transgenic organisms (e.g., animals or plants) bearing the transgene(s).

[0014] In another aspect, transgenic organisms (e.g., plants or animals) comprising a transgene integrated according to any of the methods described herein are provided. In one embodiment, a cell, cell line or transgenic organism carrying a heterozygous genotype for the selected gene is constructed, while in another embodiment, a homozygous cell, cell line or transgenic organism is made carrying two mutant copies in both alleles of a desired locus.

[0015] A kit, comprising the methods and compositions of the invention, is also provided. The kit may comprise the nucleases, (e.g. RNA molecules or ZFN, TALEN or CRISPR/Cas system encoding genes contained in a suitable expression vector), or aliquots of the nuclease proteins, donor molecules, suitable host cell lines, instructions for performing the methods of the invention, and the like. The kit may also comprise donor molecules of interest (e.g. reporter genes, specific transgenes and the like).

[0016] These and other aspects will be readily apparent to the skilled artisan in light of the disclosure as a whole.

BRIEF DESCRIPTION OF THE DRAWINGS

[0017] Figure 1, panels A to F, show capture of a transgene cleaved in vivo at an AAVS1 locus in K562 cells. Figure 1A is a schematic depicting an exemplary donor molecule having two paired binding sites (4 binding sites total, with each pair separated by spacers), which sites flank a transgene to be integrated into the genome. The target sites may be the same or different. Figure 1B is a schematic depicting four different in vivo donor cleavage techniques. In the first embodiment, when both the chromosome and the donor plasmid contain a ZFN cleavage site (dark grey region), cleavage of the donor and chromosome are synchronized, allowing efficient integration of the donor into the chromosome. Integration can occur in both forward and reverse orientations, termed "AB" and "BA," respectively. In the second embodiment, the donor contains more than one nuclease cleavage site. Nuclease action liberates a linear fragment of DNA which is integrated into the chromosome. In the third embodiment, the chromosome is cleaved by more than one nuclease and the donor by one nuclease, resulting in integration of the donor into a deletion in the chromosome. In the fourth embodiment, both the donor DNA and the chromosomal DNA are cleaved by more than one nuclease, resulting in integration of a linear fragment into a deletion in the chromosome. Figure 1C shows a comparison of donors that favor the forward (AB) integration of the transgene and re-create the same nuclease target site (top panels) with donors that favor reverse (BA) integration of the transgene and do not re-create the same nuclease target site (bottom panels). The sequence between the binding sites (spacers) is underlined and the overhangs of the wild-type (top panels) and reverse-complement (bottom panels) spacers created after cleavage are shown on the right of the top and bottom panels. The sequences shown in the top left are SEQ ID NOs: 107 and 108. The sequences shown in the top right are SEQ ID NOs: 109 and 110. The sequences shown in the bottom left are SEQ ID NOs: 111 and 112 and the sequences shown in the bottom right are SEQ ID NOs: 113 and 114. Figure 1D is a gel showing detection of targeted integration from donors with wild-type ("w.t") sequences (or spacers) between the ZFN binding sites (the same spacer sequences as in the genome) and reverse-complement ("r.c") sequences between the ZFN binding sites. As shown, more signal is seen in the BA orientation with reverse-complement overhang nucleotides. Figure 1E is a gel depicting integration of a donor plasmid at AAVS1 following in vivo cleavage by the AAVS1-specific ZFNs. A donor plasmid either containing or lacking the AAVS1 ZFN site was co-transfected with the AAVS1-specific ZFNs into K562 cells. Integration in both the AB and BA orientations was monitored by PCR of chromosome-donor junctions. Insertion in the AB orientation produces 391 and 423 bp PCR products for the left and right junctions, respectively; insertion in the BA orientation produces 369 and 471 bp junction PCR products for the left and right junctions, respectively. "A1" refers to the PCR reaction designed to amplify the AAVS1 ZFN site and "NT" refers to those reactions lacking DNA template. Figure 1F depicts both schematics of the triploid AAVS1 locus and results of interrogation of targeted integration, as assayed by Southern blot. Three clones were assayed in duplicate. Genomic DNA from the three clones was either cut with BglI and probed with an AAVS1-specific probe (top gel) or cut with AccI and probed with a transgene specific probe (bla).

[0018] Figure 2, panels A and B, show that in vivo donor cleavage promotes transgene capture at several loci in two different cell types. Figure 2A shows that targeted integration via NHEJ at sites in the IL2Rγ, CCR5, and glutamine synthetase (GS) loci is more efficient when the donor plasmid is co-cleaved with the chromosome rather than being cut prior to transfection. One junction PCR for each orientation was performed for all three loci, the left junction for the BA orientation and the right junction for the AB orientation. The experimental conditions that were used are labeled as follows: "N" refers to transfections wherein the donor lacked a ZFN site; "ERV" refers to the sample in which the donor was pre-cut with EcoRV prior to introduction into the cell; "Y" refers to a transfection where the donor and the targeted gene both contained the ZFN site; and "NT" refers to PCR reactions with no template DNA. The amplicon size expected from PCR amplification of successfully integrated donors is shown below each lane in base pairs ("Expected size, bp"). The picture shown is a color-inverted image of an ethidium bromide-stained gel. Figure 2B depicts that donor cleavage does not need to be done with the same ZFNs as those used to cut the target site in the chromosome. Junction-specific PCR assays were used to detect transgene integration into the chromosomal target and to detect the orientation of the integrated transgenes. These assays demonstrated that the transgene could integrate in either orientation following ZFN cleavage. Experimental conditions are labeled as follows: GS, GS-specific ZFNs or donor plasmid with the GS ZFN cleavage site; A1, AAVS1-specific ZFNs or donor with the AAVS1 ZFN cleavage site.

[0019] Figure 3, panels A to C, depict high-frequency targeted transgene integration at the GS locus in CHO-K1 cells. Figure 3A is a schematic of the GS locus showing the transgene integrated in the BA orientation. Figure 3B shows a Southern blot assay of cell clones for targeted integration of the donor at the GS locus. The exonic GS probe also detects two GS pseudogenes. The same panel of clones was assayed for total transgene integration by probing for the E. coli bla gene (Figure 3C). Integration of the transgene at the GS locus is seen along with transgene integration elsewhere in the genome in three of the eight clones analyzed.

[0020] Figure 4, panels A to C, depict disruption of alpha-(1,6)-fucosyltransferase (FUT8) in CHO-K1 cells by ZFN- and TALEN-mediated targeted insertion of a monoclonal antibody transgene. Figure 4A depicts area-proportional Venn diagrams showing concordance between clones screened for transgene insertion by junction-specific PCR and for IgG expression. Figure 4B is a schematic of the FUT8 locus containing the inserted transgene (depicted in light grey), labeling the orientation of the transgene in either the "AB" or "BA" nomenclature, and Figure 4C shows Southern blot confirmation of integration at FUT8. Integrants containing the transgenes inserted in the BA and AB orientations are indicated.

[0021] Figure 5, panels A to C, show ZFN activity for experiments described in the Examples. Figure 5A shows ZFN cleavage at the AAVS1 locus (top) and of the donor plasmid (bottom). The corresponding lanes in Figure 5A from Figure 1D are indicated above the gel (ex.: "1,12"), as is the presence ("Y") or absence ("N") of the Surveyor™ nuclease enzyme. The percentage of molecules modified is shown below the lanes with signal. Figure 5B shows ZFN cleavage at IL2Rγ, CCR5, and GS using the corresponding gene-specific ZFNs. As described above for 5A, the corresponding lanes in 5B from Figure 2A are shown above the gel, as is the presence ("Y") or absence ("N") of the Surveyor™ nuclease enzyme. While the gels across the top of Figure 5B depict the results from integration into the gene loci, the gels on the bottom of the figure depict the results of cleavage in the donors. Figure 5C shows ZFN cleavage at GS where the gel on the left depicts the results from the gene locus in CHO cells, while the gel on the right depicts the results of ZFN cleavage of the donor. As above, the corresponding lanes in Figure 5C from Figure 2B are shown above the gel, as is the presence ("Y") or absence ("N") of the Surveyor™ nuclease enzyme. Arrows indicate the expected cleavage products.

[0022] Figure 6 is a graph depicting homology-directed targeted integration of a GFP encoding transgene in HEK 293 and CHO-K1 cells. The percentage of cells that are GFP-positive is shown in light grey (HEK 293 cells) or dark grey (CHO-K1 cells). The amount of donor used is indicated below each grouping.

[0023] Figure 7 shows partial DNA sequence of junction PCR products from CHO K1 clones into which in vivo cleaved donors were integrated into the AAVS1 locus. Chromosomal sequence is shown in plain text, and donor sequence is shown in italics. ZFN binding sites are underlined and in bold. Microhomology is shaded in grey. The expected allele sequences in the AB orientation are shown across the top of the AB group and are defined as perfect ligation of the 5' overhangs; the expected allele sequences in the BA orientation are shown across the top of the BA group and are defined as removal of the 5' overhangs followed by ligation. Sequence identifiers are indicated in the Figure.

[0024] Figure 8, panels A to C, show DNA sequences of junction PCR assays from CHO K1 cell pools with transgene integrations at AAVS1 (Figure 8A), CCR5 (Figure 8B), GS (Figure 8C), and IL2Rγ (Figure 8C). Chromosomal sequence is shown in plain text, donor sequence is shown in italics. ZFN binding sites are underlined and in bold. The expected allele sequences are shown as above, and are also as defined above for Figure 7. Identical sequences isolated more than once are indicated and sequence identifiers are indicated in the Figure.

[0025] Figure 9 shows DNA sequences of junction PCRs from GS single cell-derived clones. Chromosomal sequence is shown in plain text, donor sequence is shown in italics. ZFN binding sites are underlined and in bold. The expected alleles in the AB orientation are defined as perfect ligation of the 5' overhangs; the expected alleles in the BA orientation are defined as removal of the 5' overhangs followed by ligation. Sequence identifiers are indicated in the Figure.

[0026] Figure 10, panels A and B, show DNA sequences of junction PCRs from single cell-derived clones with integrations at FUT8. Chromosomal sequence is shown in plain text, donor sequence is shown in italics. Figure 10A shows integration following cleavage with FUT8-targeted ZFNs (ZFN binding sites are underlined and bolded). Figure 10B shows integration following cleavage with FUT8-targeted TALENs (TALEN binding sites are underlined and bolded). Sequence identifiers are indicated in the Figure.

[0027] Figure 11 shows a plasmid map of pDAB109350.

[0028] Figure 12 shows a plasmid map of pDAB109360.

[0029] Figure 13 shows a plasmid map of pDAS000153.

[0030] Figure 14 shows a plasmid map of pDAS000150.

[0031] Figure 15 shows a plasmid map of pDAS000143.

[0032] Figure 16 shows a plasmid map of pDAS000164.

[0033] Figure 17 shows a plasmid map of pDAS000433.

[0034] Figure 18 shows a plasmid map of pDAS000434.

[0035] Figure 19, panels A and B, depict exogenous marker-free, sequential transgene stacking at an endogenous AHAS locus in the wheat genome of Triticum aestivum using ZFN-mediated, NHEJ-directed DNA repair. Figure 19A depicts the first transgene stack; Figure 19B depicts the second transgene stack.

[0036] Figure 20, panels A and B, depict exogenous marker-free, sequential transgene stacking at an endogenous AHAS locus in the wheat genome of Triticum aestivum using ZFN-mediated, HDR-directed DNA repair. Figure 20A depicts the first transgene stack; Figure 20B depicts the second transgene stack.

[0037] Figure 21 shows a plasmid map of pDAS000435.

[0038] Figure 22 shows a plasmid map of pDAB107827.

[0039] Figure 23 shows a plasmid map of pDAB107828.

[0040] Figure 24 shows a plasmid map of pDAS000340.

[0041] Figure 25 shows a plasmid map of pDAS000341.

[0042] Figure 26 shows a plasmid map of pDAS000342.

[0043] Figure 27 shows a plasmid map of pDAS000343.

[0044] Figure 28, panels A and B, show the locations of the primers and their position relative to the start and stop codon of Fad3C. Figure 28A shows the location of the primer sites for the wild type Fad3C locus. Figure 28B shows the location of the primer sites to confirm donor integration, and the possible orientations by which the donor could integrate within the Fad3C locus.

[0045] Figure 29, panels A and B, show sequence alignments of various targeted integrations. Figure 29A shows a sequence alignment amplified from the junction of the tGFP cassette of pDAS000341 with Fad3C at the double strand break as recognized by ZFN 28051-2A-28052. Sequences shown are SEQ ID NOs:480 to 493 from top to bottom. The ":" indicates the deletions located at the cut sites. Figure 29B shows a sequence alignment amplified from the junction of the tGFP cassette of pDAS000343 with Fad3C at the double strand break as recognized by ZFN 28051-2A-28052 and ZFN 28053-2A-28054. The ":" indicates the deletions located at the cut sites. Sequences shown are SEQ ID NOs:494 to 507 from top to bottom.

[0046] Figure 30, panels A and B, show a sequence alignment amplified from the junction of the hph cassette of pDAS000340 with FAD3C at the double strand break as recognized by ZFN 28051-2A-28052. The ":" indicates the deletions located at the cut sites. Sequences shown are SEQ ID NOs:508 to 523 from top to bottom. Figure 30A shows sequences for the 5' junction and the sequences shown in Figure 30B are for the 3' junction.

[0047] Figure 31, panels A and B, show a sequence alignment amplified from the junction of the hph cassette of pDAS00034 with FAD3C at the double strand break as recognized by ZFN 28051-2A-28052 and 28053-2A-28054. The ":" indicates the deletions located at the cut sites. Sequences shown are SEQ ID NOs:524 to 532 from top to bottom. The sequences shown in Figure 31A are for the 5' junction and the sequences shown in Figure 31B are for the 3' junction.

[0048] Figure 32 depicts the relation of the ZFNs designed to bind the genomic locus of transgenic insert in Corn Event DAS-59132. Six ZFNs (E32 ZFN1-6) were identified from the yeast assay and four ZFNs were advanced for evaluation in plants.

[0049] Figure 33 shows a plasmid map of pDAB105906.

[0050] Figure 34 shows a plasmid map of pDAB111809.

[0051] Figure 35 shows a plasmid map of pDAB100655.

[0052] Figure 36 depicts a graph showing evaluation of transiently expressed ZFNs in plants. Four ZFNs were evaluated in maize callus by transiently expressing the ZFNs and an internal control ZFN directed to the IPPK2 gene. After Next Generation Sequencing of PCR amplified fragments from the region surrounding the ZFN cleavage sites, the sequenced PCR amplified fragments were scored for the presence of sequence variants resulting from indels. The relative frequency of indels from each of the four E32 ZFN pairs as compared to IPPK2 ZFN activity is depicted. Event 32 ZFN6 which contains the 25716 (716) and 25717 (717) zinc finger binding domains cleaved the genomic locus of transgenic Corn Event DAS-59132 at 380 times the efficiency of the control IPPK2 zinc finger nuclease.

[0053] Figure 37 depicts a graph of the ZFN locus disruption of Corn Event DAS-59132.

[0054] Figure 38 is a schematic depicting the experimental system used for donor integration into the ELP of maize genome.

[0055] Figure 39 is a graph illustrating the cleavage of genomic target DNA by eZFNs. DNA was isolated from each treatment group (6 replicates each) as indicated. TAQMAN™ assays were used to measure cleavage of the target DNA. Cleavage activity of the eZFNs is relative to the Donor DNA alone treatments. eZFNs (eZFN1 and eZFN3) levels were 1:1 or 1:10 ratios relative to the Donor DNA. Statistical groupings are indicated by lower case letters.

[0056] Figure 40 illustrates the primer binding sites within the ELP loci of the corn genome.

[0057] Figure 41 illustrates the primer binding sites of the pDAB100651 fragment for copy number evaluation.

[0058] Figure 42 shows the cleavage activity of the eZFNs relative to the Donor DNA alone treatments. The eZFN (eZFN1 and eZFN3) cleavage levels were 1:1 or 1:10 ratios relative to the donor DNA. Statistical groupings are indicated by lower case letters.

[0059] Figure 43, panels A and B, show the junction sequence from in-out PCR reactions. The left and right sequences are partial sequences of the AAD1 and ELP, respectively. The sequence expected from an insertion restoring the eZFN binding site is shown in the blue font. The eZFN binding site is highlighted green and deletions are black bars. The sequence for the direct orientation (Figure 43A) and the reverse orientation (Figure 43B) are shown. The sequences are in blocks according to the PCR reaction from which they were cloned.

DETAILED DESCRIPTION

[0060] Disclosed herein are compositions and methods for nuclease-mediated homology-independent (e.g., NHEJ capture) targeted integration of a transgene. While insertion of oligonucleotides can be performed via simple co-transfection of DNA with compatible 5' overhangs, it has now been shown that NHEJ capture of transgene-size fragments (e.g., > 0.5 kb) is greatly facilitated by in vivo nuclease-mediated cleavage of the donor plasmid in addition to cleavage of the chromosome. In this way, transgenes of larger size (e.g., between 1 and 14 kb or longer in length) can be integrated in a targeted manner into organisms and cell lines, such as Chinese hamster ovary (CHO) cells, which are recalcitrant to HDR-based integration. For example, in vivo donor cleavage allowed targeted integration at high frequency (6%) in unselected CHO cells, a cell type otherwise recalcitrant to targeted insertion of large DNA sequences.

[0061] Co-cleavage of the chromosome and transgene-containing double-stranded donor as described herein results in successful integration into any endogenous target locus in a selected host cell. The methods and compositions described herein allow for efficient non-homology-driven targeted integration that is not generally achievable by simple co-transfection of pre-cut donors.

[0062] Thus, the compositions and methods described herein allow for homology-independent targeted integration of large transgenes into sites of nuclease cleavage, including into deletions created by engineered nucleases such as ZFNs and/or TALENs. Alternately, a donor plasmid with nuclease sites flanking a transgene to be integrated can be used such that the transgene portion is liberated upon nuclease cleavage and efficiently integrated at a targeted location. Further, the methods and compositions of the invention allow for nuclease-mediated in vivo cleavage of a large donor molecule such as a bacterial or yeast artificial chromosome, which permits the targeted integration of large transgenes in mammalian and plant cells. Finally, the in vivo cleavage compositions and methods described will find use in the targeted genetic modification of other organisms and cells, especially those which perform homology-directed DNA repair poorly.
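As a purely illustrative aid to the flanked-donor arrangement described above, the following Python sketch treats the donor and the chromosome as plain strings. It is a toy model under stated assumptions, not the disclosed method: the function names, the 7-nucleotide site GATTACA and the example sequences are all hypothetical, the donor is handled as a linear string rather than a circular plasmid, the site is consumed on liberation, the chromosome is cut at the midpoint of its site, and overhangs, orientation and end processing are ignored.

```python
def liberate_transgene(donor: str, site: str) -> str:
    """Return the segment between the first two occurrences of `site` in `donor`.

    Toy model: the donor is treated as linear and the site itself is discarded.
    """
    first = donor.find(site)
    if first == -1:
        raise ValueError("nuclease site not found in donor")
    second = donor.find(site, first + len(site))
    if second == -1:
        raise ValueError("donor must carry two flanking nuclease sites")
    return donor[first + len(site):second]


def integrate(chromosome: str, site: str, insert: str) -> str:
    """Place `insert` at the midpoint of the first `site` in `chromosome`.

    Toy stand-in for end joining at a cleaved target; no bases are lost or added.
    """
    start = chromosome.find(site)
    if start == -1:
        raise ValueError("nuclease site not found in chromosome")
    cut = start + len(site) // 2
    return chromosome[:cut] + insert + chromosome[cut:]


# Hypothetical sequences: lowercase is backbone or flanking genome, uppercase is site or transgene.
SITE = "GATTACA"
TRANSGENE = "ATGAAATAA"
DONOR = "cccc" + SITE + TRANSGENE + SITE + "gggg"
CHROMOSOME = "tttt" + SITE + "tttt"

fragment = liberate_transgene(DONOR, SITE)
edited = integrate(CHROMOSOME, SITE, fragment)
print(fragment)
print(edited)
```

The same nuclease site is used on the donor and on the chromosome here only to keep the sketch short; nothing in the sketch depends on that choice.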

General

[0063] Practice of the methods, as well as preparation and use of the compositions disclosed herein employ, unless otherwise indicated, conventional techniques in molecular biology, biochemistry, chromatin structure and analysis, computational chemistry, cell culture, recombinant DNA and related fields as are within the skill of the art. These techniques are fully explained in the literature. See, for example, Sambrook et al. MOLECULAR CLONING: A LABORATORY MANUAL, Second edition, Cold Spring Harbor Laboratory Press, 1989 and Third edition, 2001; Ausubel et al., CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, John Wiley & Sons, New York, 1987 and periodic updates; the series METHODS IN ENZYMOLOGY, Academic Press, San Diego; Wolffe, CHROMATIN STRUCTURE AND FUNCTION, Third edition, Academic Press, San Diego, 1998; METHODS IN ENZYMOLOGY, Vol. 304, "Chromatin" (P.M. Wassarman and A. P. Wolffe, eds.), Academic Press, San Diego, 1999; and METHODS IN MOLECULAR BIOLOGY, Vol. 119, "Chromatin Protocols" (P.B. Becker, ed.) Humana Press, Totowa, 1999.

Definitions

[0064] The terms "nucleic acid," "polynucleotide," and "oligonucleotide" are used interchangeably and refer to a deoxyribonucleotide or ribonucleotide polymer, in linear or circular conformation, and in either single- or double-stranded form. For the purposes of the present disclosure, these terms are not to be construed as limiting with respect to the length of a polymer. The terms can encompass known analogues of natural nucleotides, as well as nucleotides that are modified in the base, sugar and/or phosphate moieties (e.g., phosphorothioate backbones). In general, an analogue of a particular nucleotide has the same base-pairing specificity; i.e., an analogue of A will base-pair with T.

[0065] The terms "polypeptide," "peptide" and "protein" are used interchangeably to refer to a polymer of amino acid residues. The term also applies to amino acid polymers in which one or more amino acids are chemical analogues or modified derivatives of a corresponding naturally-occurring amino acid.

[0066] "Binding" refers to a sequence-specific, non-covalent interaction between macromolecules (e.g., between a protein and a nucleic acid). Not all components of a binding interaction need be sequence-specific (e.g., contacts with phosphate residues in a DNA backbone), as long as the interaction as a whole is sequence-specific. Such interactions are generally characterized by a dissociation constant (Kd) of 10⁻⁶ M⁻¹ or lower. "Affinity" refers to the strength of binding: increased binding affinity being correlated with a lower Kd.
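The inverse relation between affinity and the dissociation constant can be illustrated with the standard single-site equilibrium relation, fraction bound = [L] / ([L] + Kd). The short Python sketch below is an illustration only. It assumes a simple one-site binding model with the free ligand concentration and Kd both expressed as molar concentrations, and the function name is hypothetical.

```python
def fraction_bound(ligand_molar: float, kd_molar: float) -> float:
    """Fraction of binding sites occupied at equilibrium in a one-site model.

    Assumes free ligand concentration and Kd are both given in mol/L.
    """
    if ligand_molar < 0 or kd_molar <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return ligand_molar / (ligand_molar + kd_molar)


# At a fixed ligand concentration, a lower Kd (higher affinity) gives higher occupancy.
for kd in (1e-6, 1e-9):
    print(kd, fraction_bound(1e-6, kd))
```

When the ligand concentration equals Kd, half of the sites are occupied, which is the usual operational reading of the constant.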

[0067] A "binding protein" is a protein that is able to bind non-covalently to another molecule. A binding protein can bind to, for example, a DNA molecule (a DNA-binding protein), an RNA molecule (an RNA-binding protein) and/or a protein molecule (a protein-binding protein). In the case of a protein-binding protein, it can bind to itself (to form homodimers, homotrimers, etc.) and/or it can bind to one or more molecules of a different protein or proteins. A binding protein can have more than one type of binding activity. For example, zinc finger proteins have DNA-binding, RNA-binding and protein-binding activity.

[0068] A "zinc finger DNA binding protein" (or binding domain) is a protein, or a domain within a larger protein, that binds DNA in a sequence-specific manner through one or more zinc fingers, which are regions of amino acid sequence within the binding domain whose structure is stabilized through coordination of a zinc ion. The term zinc finger DNA binding protein is often abbreviated as zinc finger protein or ZFP.

[0069] A "TALE DNA binding domain" or "TALE" is a polypeptide comprising one or more TALE repeat domains/units. The repeat domains are involved in binding of the TALE to its cognate target DNA sequence. A single "repeat unit" (also referred to as a "repeat") is typically 33-35 amino acids in length and exhibits at least some sequence homology with other TALE repeat sequences within a naturally occurring TALE protein. See, e.g., U.S. Patent Publication No. 20110301073, incorporated by reference herein in its entirety.

[0070] Zinc finger and TALE binding domains can be "engineered" to bind to a predetermined nucleotide sequence, for example via engineering (altering one or more amino acids) of the recognition helix region of a naturally occurring zinc finger. Similarly, TALEs can be "engineered" to bind to a predetermined nucleotide sequence, for example by engineering of the amino acids involved in DNA binding (the repeat variable diresidue or RVD region). Therefore, engineered DNA binding proteins (zinc fingers or TALEs) are proteins that are non-naturally occurring. Non-limiting examples of methods for engineering DNA-binding proteins are design and selection. A designed DNA binding protein is a protein not occurring in nature whose design/composition results principally from rational criteria. Rational criteria for design include application of substitution rules and computerized algorithms for processing information in a database storing information of existing ZFP and/or TALE designs and binding data. See, for example, U.S. Patents 6,140,081; 6,453,242; and 6,534,261; see also WO 98/53058; WO 98/53059; WO 98/53060; WO 02/016536 and WO 03/016496 and U.S. Publication Nos. 20110301073, 20110239315 and 20110145940.

[0071] A "selected" zinc finger protein or TALE is a protein not found in nature whose production results primarily from an empirical process such as phage display, interaction trap or hybrid selection. See e.g., U.S. Patent Nos. 5,789,538; US 5,925,523; US 6,007,988; US 6,013,453; US 6,200,759; WO 95/19431; WO 96/06166; WO 98/53057; WO 98/54311; WO 00/27878; WO 01/60970; WO 01/88197 and WO 02/099084 and U.S. Publication Nos. 20110301073, 20110239315 and 20110145940.

[0072] "Recombination" refers to a process of exchange of genetic information between two polynucleotides, including but not limited to, donor capture by non-homologous end joining (NHEJ) and homologous recombination. For the purposes of this disclosure, "homologous recombination (HR)" refers to the specialized form of such exchange that takes place, for example, during repair of double-strand breaks in cells via homology-directed repair mechanisms. This process requires nucleotide sequence homology, uses a "donor" molecule to template repair of a "target" molecule (i.e., the one that experienced the double-strand break), and is variously known as "non-crossover gene conversion" or "short tract gene conversion," because it leads to the transfer of genetic information from the donor to the target. Without wishing to be bound by any particular theory, such transfer can involve mismatch correction of heteroduplex DNA that forms between the broken target and the donor, and/or "synthesis-dependent strand annealing," in which the donor is used to resynthesize genetic information that will become part of the target, and/or related processes. Such specialized HR often results in an alteration of the sequence of the target molecule such that part or all of the sequence of the donor polynucleotide is incorporated into the target polynucleotide. For HR-directed integration, the donor molecule contains at least 2 regions of homology to the genome ("homology arms") of at least 50-100 base pairs in length. See, e.g., U.S. Patent Publication No. 20110281361.

[0073] In the methods of the disclosure, one or more targeted nucleases as described herein create a double-stranded break in the target sequence (e.g., cellular chromatin) at a predetermined site, and a "donor" polynucleotide, having homology to the nucleotide sequence in the region of the break, can be introduced into the cell. The presence of the double-stranded break has been shown to facilitate integration of the donor sequence. The donor sequence may be physically integrated or, alternatively, the donor polynucleotide is used as a template for repair of the break via homologous recombination, resulting in the introduction of all or part of the nucleotide sequence as in the donor into the cellular chromatin. Thus, a first sequence in cellular chromatin can be altered and, in certain embodiments, can be converted into a sequence present in a donor polynucleotide. Thus, the use of the terms "replace" or "replacement" can be understood to represent replacement of one nucleotide sequence by another (i.e., replacement of a sequence in the informational sense), and does not necessarily require physical or chemical replacement of one polynucleotide by another.

[0074] In any of the methods described herein, additional pairs of zinc-finger proteins or TALEN can be used for additional double-stranded cleavage of additional target sites within the cell.

[0075] Any of the methods described herein can be used for insertion of a donor of any size and/or partial or complete inactivation of one or more target sequences in a cell by targeted integration of donor sequence that disrupts expression of the gene(s) of interest. Cell lines with partially or completely inactivated genes are also provided.

[0076] Furthermore, the methods of targeted integration as described herein can also be used to integrate one or more exogenous sequences. The exogenous nucleic acid sequence can comprise, for example, one or more genes or cDNA molecules, or any type of coding or noncoding sequence, as well as one or more control elements (e.g., promoters). In addition, the exogenous nucleic acid sequence (transgene) may produce one or more RNA molecules (e.g., small hairpin RNAs (shRNAs), inhibitory RNAs (RNAis), microRNAs (miRNAs), etc.).

[0077] "Cleavage" refers to the breakage of the covalent backbone of a DNA molecule. Cleavage can be initiated by a variety of methods including, but not limited to, enzymatic or chemical hydrolysis of a phosphodiester bond. Both single-stranded cleavage and double-stranded cleavage are possible, and double-stranded cleavage can occur as a result of two distinct single-stranded cleavage events. DNA cleavage can result in the production of either blunt ends or staggered ends. In certain embodiments, fusion polypeptides are used for targeted double-stranded DNA cleavage.

[0078] A "cleavage half-domain" is a polypeptide sequence which, in conjunction with a second polypeptide (either identical or different) forms a complex having cleavage activity (preferably double-strand cleavage activity). The terms "first and second cleavage half-domains;" "+ and - cleavage half-domains" and "right and left cleavage half-domains" are used interchangeably to refer to pairs of cleavage half-domains that dimerize.

[0079] An "engineered cleavage half-domain" is a cleavage half-domain that has been modified so as to form obligate heterodimers with another cleavage half-domain (e.g., another engineered cleavage half-domain). See, also, U.S. Patent Publication Nos. 2005/0064474, 20070218528 and 2008/0131962, incorporated herein by reference in their entireties.

[0080] The term "sequence" refers to a nucleotide sequence of any length, which can be DNA or RNA; can be linear, circular or branched and can be either single-stranded or double stranded. The term "donor sequence" or "transgene" refers to a nucleotide sequence that is inserted into a genome. A donor sequence can be of any length, for example between 2 and 100,000,000 nucleotides in length (or any integer value therebetween or thereabove), preferably between about 100 and 100,000 nucleotides in length (or any integer therebetween), more preferably between about 2000 and 60,000 nucleotides in length (or any value therebetween) and even more preferably, between about 3 and 15 kb (or any value therebetween).
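The nested length ranges recited above can be summarized in a small lookup. The Python sketch below is illustrative only: the function name and tier labels are hypothetical, and the "about" qualifiers in the text are treated as exact, inclusive bounds purely for convenience.

```python
def donor_length_tier(length_nt: int) -> str:
    """Return the narrowest recited range that contains a donor length (in nucleotides).

    Assumption: "about" bounds are treated as exact and inclusive.
    """
    if 3_000 <= length_nt <= 15_000:
        return "even more preferred"   # about 3 to 15 kb
    if 2_000 <= length_nt <= 60_000:
        return "more preferred"        # about 2000 to 60,000 nucleotides
    if 100 <= length_nt <= 100_000:
        return "preferred"             # about 100 to 100,000 nucleotides
    if length_nt >= 2:
        return "exemplified"           # 2 nucleotides and above
    return "not exemplified"


for n in (50, 500, 2_500, 5_000, 80_000):
    print(n, donor_length_tier(n))
```

Checking the narrowest range first means a 5 kb donor is reported under the 3 to 15 kb range even though it also satisfies every broader range.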

[0081] "Chromatin" is the nucleoprotein structure comprising the cellular genome. Cellular chromatin comprises nucleic acid, primarily DNA, and protein, including histones and non-histone chromosomal proteins. The majority of eukaryotic cellular chromatin exists in the form of nucleosomes, wherein a nucleosome core comprises approximately 150 base pairs of DNA associated with an octamer comprising two each of histones H2A, H2B, H3 and H4; and linker DNA (of variable length depending on the organism) extends between nucleosome cores. A molecule of histone H1 is generally associated with the linker DNA. For the purposes of the present disclosure, the term "chromatin" is meant to encompass all types of cellular nucleoprotein, both prokaryotic and eukaryotic. Cellular chromatin includes both chromosomal and episomal chromatin.

[0082] A "chromosome," is a chromatin complex comprising all or a portion of the genome of a cell. The genome of a cell is often characterized by its karyotype, which is the collection of all the chromosomes that comprise the genome of the cell. The genome of a cell can comprise one or more chromosomes.

[0083] An "episome" is a replicating nucleic acid, nucleoprotein complex or other structure comprising a nucleic acid that is not part of the chromosomal karyotype of a cell. Examples of episomes include plasmids and certain viral genomes.

[0084] A "target site" or "target sequence" is a nucleic acid sequence that defines a portion of a nucleic acid to which a binding molecule will bind, provided sufficient conditions for binding exist.

[0085] An "exogenous" molecule is a molecule that is not normally present in a cell, but can be introduced into a cell by one or more genetic, biochemical or other methods. "Normal presence in the cell" is determined with respect to the particular developmental stage and environmental conditions of the cell. Thus, for example, a molecule that is present only during embryonic development of muscle is an exogenous molecule with respect to an adult muscle cell. Similarly, a molecule induced by heat shock is an exogenous molecule with respect to a non-heat-shocked cell. An exogenous molecule can comprise, for example, a functioning version of a malfunctioning endogenous molecule or a malfunctioning version of a normally-functioning endogenous molecule.

[0086] An exogenous molecule can be, among other things, a small molecule, such as is generated by a combinatorial chemistry process, or a macromolecule such as a protein, nucleic acid, carbohydrate, lipid, glycoprotein, lipoprotein, polysaccharide, any modified derivative of the above molecules, or any complex comprising one or more of the above molecules. Nucleic acids include DNA and RNA, can be single- or double-stranded; can be linear, branched or circular; and can be of any length. Nucleic acids include those capable of forming duplexes, as well as triplex-forming nucleic acids. See, for example, U.S. Patent Nos. 5,176,996 and 5,422,251. Proteins include, but are not limited to, DNA-binding proteins, transcription factors, chromatin remodeling factors, methylated DNA binding proteins, polymerases, methylases, demethylases, acetylases, deacetylases, kinases, phosphatases, integrases, recombinases, ligases, topoisomerases, gyrases and helicases.

[0087] An exogenous molecule can be the same type of molecule as an endogenous molecule, e.g., an exogenous protein or nucleic acid. For example, an exogenous nucleic acid can comprise an infecting viral genome, a plasmid or episome introduced into a cell, or a chromosome that is not normally present in the cell. Methods for the introduction of exogenous molecules into cells are known to those of skill in the art and include, but are not limited to, lipid-mediated transfer (i.e., liposomes, including neutral and cationic lipids), electroporation, direct injection, cell fusion, particle bombardment, calcium phosphate co-precipitation, DEAE-dextran-mediated transfer and viral vector-mediated transfer. An exogenous molecule can also be the same type of molecule as an endogenous molecule but derived from a different species than the cell is derived from. For example, a human nucleic acid sequence may be introduced into a cell line originally derived from a mouse or hamster.

[0088] By contrast, an "endogenous" molecule is one that is normally present in a particular cell at a particular developmental stage under particular environmental conditions. For example, an endogenous nucleic acid can comprise a chromosome, the genome of a mitochondrion, chloroplast or other organelle, or a naturally-occurring episomal nucleic acid. Additional endogenous molecules can include proteins, for example, transcription factors and enzymes.

[0089] A "fusion" molecule is a molecule in which two or more subunit molecules are linked, preferably covalently. The subunit molecules can be the same chemical type of molecule, or can be different chemical types of molecules. Examples of the first type of fusion molecule include, but are not limited to, fusion proteins (for example, a fusion between a ZFP or TALE DNA-binding domain and one or more activation domains) and fusion nucleic acids (for example, a nucleic acid encoding the fusion protein described supra). Examples of the second type of fusion molecule include, but are not limited to, a fusion between a triplex-forming nucleic acid and a polypeptide, and a fusion between a minor groove binder and a nucleic acid.

[0090] Expression of a fusion protein in a cell can result from delivery of the fusion protein to the cell or by delivery of a polynucleotide encoding the fusion protein to a cell, wherein the polynucleotide is transcribed, and the transcript is translated, to generate the fusion protein. Trans-splicing, polypeptide cleavage and polypeptide ligation can also be involved in expression of a protein in a cell. Methods for polynucleotide and polypeptide delivery to cells are presented elsewhere in this disclosure.

[0091] A "gene," for the purposes of the present disclosure, includes a DNA region encoding a gene product (see infra), as well as all DNA regions which regulate the production of the gene product, whether or not such regulatory sequences are adjacent to coding and/or transcribed sequences. Accordingly, a gene includes, but is not necessarily limited to, promoter sequences, terminators, translational regulatory sequences such as ribosome binding sites and internal ribosome entry sites, enhancers, silencers, insulators, boundary elements, replication origins, matrix attachment sites and locus control regions.

[0092] "Gene expression" refers to the conversion of the information, contained in a gene, into a gene product. A gene product can be the direct transcriptional product of a gene (e.g., mRNA, tRNA, rRNA, antisense RNA, ribozyme, structural RNA or any other type of RNA) or a protein produced by translation of an mRNA. Gene products also include RNAs which are modified, by processes such as capping, polyadenylation, methylation, and editing, and proteins modified by, for example, methylation, acetylation, phosphorylation, ubiquitination, ADP-ribosylation, myristilation, and glycosylation.

[0093] "Modulation" of gene expression refers to a change in the activity of a gene. Modulation of expression can include, but is not limited to, gene activation and gene repression. Genome editing (e.g., cleavage, alteration, inactivation, random mutation) can be used to modulate expression. Gene inactivation refers to any reduction in gene expression as compared to a cell that does not include a ZFP as described herein. Thus, gene inactivation may be partial or complete.

[0094] A "region of interest" is any region of cellular chromatin, such as, for example, a gene or a non-coding sequence within or adjacent to a gene, in which it is desirable to bind an exogenous molecule. Binding can be for the purposes of targeted DNA cleavage and/or targeted recombination. A region of interest can be present in a chromosome, an episome, an organellar genome (e.g., mitochondrial, chloroplast), or an infecting viral genome, for example. A region of interest can be within the coding region of a gene, within transcribed non-coding regions such as, for example, leader sequences, trailer sequences or introns, or within non-transcribed regions, either upstream or downstream of the coding region. A region of interest can be as small as a single nucleotide pair or up to 2,000 nucleotide pairs in length, or any integral value of nucleotide pairs.

[0095] "Eukaryotic" cells include, but are not limited to, fungal cells (such as yeast), plant cells, animal cells, mammalian cells and human cells (e.g., T-cells).

[0096] The terms "operative linkage" and "operatively linked" (or "operably linked") are used interchangeably with reference to a juxtaposition of two or more components (such as sequence elements), in which the components are arranged such that both components function normally and allow the possibility that at least one of the components can mediate a function that is exerted upon at least one of the other components. By way of illustration, a transcriptional regulatory sequence, such as a promoter, is operatively linked to a coding sequence if the transcriptional regulatory sequence controls the level of transcription of the coding sequence in response to the presence or absence of one or more transcriptional regulatory factors. A transcriptional regulatory sequence is generally operatively linked in cis with a coding sequence, but need not be directly adjacent to it. For example, an enhancer is a transcriptional regulatory sequence that is operatively linked to a coding sequence, even though they are not contiguous.

[0097] With respect to fusion polypeptides, the term "operatively linked" can refer to the fact that each of the components performs the same function in linkage to the other component as it would if it were not so linked. For example, with respect to a fusion polypeptide in which a ZFP, TALE or Cas DNA-binding domain is fused to an activation domain, the ZFP, TALE or Cas DNA-binding domain and the activation domain are in operative linkage if, in the fusion polypeptide, the ZFP, TALE or Cas DNA-binding domain portion is able to bind its target site and/or its binding site, while the activation domain is able to upregulate gene expression. Likewise, with respect to a fusion polypeptide in which a ZFP, TALE or Cas DNA-binding domain is fused to a cleavage domain, the ZFP, TALE or Cas DNA-binding domain and the cleavage domain are in operative linkage if, in the fusion polypeptide, the ZFP, TALE or Cas DNA-binding domain portion is able to bind its target site and/or its binding site, while the cleavage domain is able to cleave DNA in the vicinity of the target site.

[0098] A "functional fragment" of a protein, polypeptide or nucleic acid is a protein, polypeptide or nucleic acid whose sequence is not identical to the full-length protein, polypeptide or nucleic acid, yet retains the same function as the full-length protein, polypeptide or nucleic acid. A functional fragment can possess more, fewer, or the same number of residues as the corresponding native molecule, and/or can contain one or more amino acid or nucleotide substitutions. Methods for determining the function of a nucleic acid (e.g., coding function, ability to hybridize to another nucleic acid) are well-known in the art. Similarly, methods for determining protein function are well-known. For example, the DNA-binding function of a polypeptide can be determined by filter-binding, electrophoretic mobility-shift, or immunoprecipitation assays. DNA cleavage can be assayed by gel electrophoresis. See Ausubel et al., supra. The ability of a protein to interact with another protein can be determined, for example, by co-immunoprecipitation, two-hybrid assays or complementation, both genetic and biochemical. See, for example, Fields et al. (1989) Nature 340:245-246; U.S. Patent No. 5,585,245 and PCT WO 98/44350.

[0099] A "vector" is capable of transferring gene sequences to target cells. Typically, "vector construct," "expression vector," and "gene transfer vector," mean any nucleic acid construct capable of directing the expression of a gene of interest and which can transfer gene sequences to target cells. Thus, the term includes cloning and expression vehicles, as well as integrating vectors.

[0100] A "reporter gene" or "reporter sequence" refers to any sequence that produces a protein product that is easily measured, preferably although not necessarily in a routine assay. Suitable reporter genes include, but are not limited to, sequences encoding proteins that mediate antibiotic resistance (e.g., ampicillin resistance, neomycin resistance, G418 resistance, puromycin resistance), sequences encoding colored or fluorescent or luminescent proteins (e.g., green fluorescent protein, enhanced green fluorescent protein, red fluorescent protein, luciferase), and proteins which mediate enhanced cell growth and/or gene amplification (e.g., dihydrofolate reductase). Epitope tags include, for example, one or more copies of FLAG, His, myc, Tap, HA or any detectable amino acid sequence. "Expression tags" include sequences that encode reporters that may be operably linked to a desired gene sequence in order to monitor expression of the gene of interest.

[0101] A "safe harbor" locus is a locus within the genome wherein a gene may be inserted without any deleterious effects on the host cell. Most beneficial is a safe harbor locus in which expression of the inserted gene sequence is not perturbed by any read-through expression from neighboring genes. Non-limiting examples of safe harbor loci in mammalian cells are the AAVS1 gene (see U.S. Patent No. 8,110,379), the CCR5 gene (see U.S. Publication No. 20080159996), the Rosa locus (see WO 2010/065123) and/or the albumin locus (U.S. Application No. 13/624,193). Non-limiting examples of safe harbor loci in plant cells are the ZP15 locus (U.S. Patent No. 8,329,986).

Nucleases

[0102] Described herein are compositions, particularly nucleases, that are useful for in vivo cleavage of a donor molecule carrying a transgene and nucleases for cleavage of the genome of a cell such that the transgene is integrated into the genome in a targeted manner. In certain embodiments, one or more of the nucleases are naturally occurring. In other embodiments, one or more of the nucleases are non-naturally occurring, i.e., engineered in the DNA-binding domain and/or cleavage domain. For example, the DNA-binding domain of a naturally-occurring nuclease may be altered to bind to a selected target site (e.g., a meganuclease that has been engineered to bind to a site different than the cognate binding site). In other embodiments, the nuclease comprises heterologous DNA-binding and cleavage domains (e.g., zinc finger nucleases; TAL-effector domain DNA binding proteins; meganuclease DNA-binding domains with heterologous cleavage domains).

A. DNA-binding domains

[0103] In certain embodiments, the composition and methods described herein employ a meganuclease (homing endonuclease) DNA-binding domain for binding to the donor molecule and/or binding to the region of interest in the genome of the cell. Naturally-occurring meganucleases recognize 15-40 base-pair cleavage sites and are commonly grouped into four families: the LAGLIDADG family, the GIY-YIG family, the His-Cyst box family and the HNH family. Exemplary homing endonucleases include I-SceI, I-CeuI, PI-PspI, PI-Sce, I-SceIV, I-CsmI, I-PanI, I-SceII, I-PpoI, I-SceIII, I-CreI, I-TevI, I-TevII and I-TevIII. Their recognition sequences are known. See also U.S. Patent No. 5,420,032; U.S. Patent No. 6,833,252; Belfort et al. (1997) Nucleic Acids Res. 25:3379-3388; Dujon et al. (1989) Gene 82:115-118; Perler et al. (1994) Nucleic Acids Res. 22, 1125-1127; Jasin (1996) Trends Genet. 12:224-228; Gimble et al. (1996) J. Mol. Biol. 263:163-180; Argast et al. (1998) J. Mol. Biol. 280:345-353 and the New England Biolabs catalogue.

[0104] In certain embodiments, the methods and compositions described herein make use of a nuclease that comprises an engineered (non-naturally occurring) homing endonuclease (meganuclease). The recognition sequences of homing endonucleases and meganucleases such as I-SceI, I-CeuI, PI-PspI, PI-Sce, I-SceIV, I-CsmI, I-PanI, I-SceII, I-PpoI, I-SceIII, I-CreI, I-TevI, I-TevII and I-TevIII are known. See also U.S. Patent No. 5,420,032; U.S. Patent No. 6,833,252; Belfort et al. (1997) Nucleic Acids Res. 25:3379-3388; Dujon et al. (1989) Gene 82:115-118; Perler et al. (1994) Nucleic Acids Res. 22, 1125-1127; Jasin (1996) Trends Genet. 12:224-228; Gimble et al. (1996) J. Mol. Biol. 263:163-180; Argast et al. (1998) J. Mol. Biol. 280:345-353 and the New England Biolabs catalogue. In addition, the DNA-binding specificity of homing endonucleases and meganucleases can be engineered to bind non-natural target sites. See, for example, Chevalier et al. (2002) Molec. Cell 10:895-905; Epinat et al. (2003) Nucleic Acids Res. 31:2952-2962; Ashworth et al. (2006) Nature 441:656-659; Paques et al. (2007) Current Gene Therapy 7:49-66; U.S. Patent Publication No. 20070117128. The DNA-binding domains of the homing endonucleases and meganucleases may be altered in the context of the nuclease as a whole (i.e., such that the nuclease includes the cognate cleavage domain) or may be fused to a heterologous cleavage domain.

[0105] In other embodiments, the DNA-binding domain of one or more of the nucleases used in the methods and compositions described herein comprises a naturally occurring or engineered (non-naturally occurring) TAL effector DNA binding domain. See, e.g., U.S. Patent Publication No. 20110301073, incorporated by reference in its entirety herein. The plant pathogenic bacteria of the genus Xanthomonas are known to cause many diseases in important crop plants. Pathogenicity of Xanthomonas depends on a conserved type III secretion (T3S) system which injects more than 25 different effector proteins into the plant cell. Among these injected proteins are transcription activator-like (TAL) effectors which mimic plant transcriptional activators and manipulate the plant transcriptome (see Kay et al. (2007) Science 318:648-651). These proteins contain a DNA binding domain and a transcriptional activation domain. One of the most well characterized TAL-effectors is AvrBs3 from Xanthomonas campestris pv. vesicatoria (see Bonas et al. (1989) Mol Gen Genet 218:127-136 and WO2010079430). TAL-effectors contain a centralized domain of tandem repeats, each repeat containing approximately 34 amino acids, which are key to the DNA binding specificity of these proteins. In addition, they contain a nuclear localization sequence and an acidic transcriptional activation domain (for a review see Schornack S, et al. (2006) J Plant Physiol 163(3):256-272). In addition, in the phytopathogenic bacteria Ralstonia solanacearum two genes, designated brg11 and hpx17, have been found that are homologous to the AvrBs3 family of Xanthomonas in the R. solanacearum biovar 1 strain GMI1000 and in the biovar 4 strain RS1000 (see Heuer et al. (2007) Appl and Envir Micro 73(13):4379-4384). These genes are 98.9% identical in nucleotide sequence to each other but differ by a deletion of 1,575 bp in the repeat domain of hpx17. However, both gene products have less than 40% sequence identity with AvrBs3 family proteins of Xanthomonas. See, e.g., U.S. Patent Publication Nos. 20110239315, 20110145940 and 20110301073, incorporated by reference in their entireties herein.

[0106] Specificity of these TAL effectors depends on the sequences found in the tandem repeats. The repeated sequence comprises approximately 102 bp and the repeats are typically 91-100% homologous with each other (Bonas et al., ibid). Polymorphism of the repeats is usually located at positions 12 and 13 and there appears to be a one-to-one correspondence between the identity of the hypervariable diresidues at positions 12 and 13 and the identity of the contiguous nucleotides in the TAL-effector's target sequence (see Moscou and Bogdanove (2009) Science 326:1501 and Boch et al. (2009) Science 326:1509-1512). Experimentally, the natural code for DNA recognition of these TAL-effectors has been determined such that an HD sequence at positions 12 and 13 leads to a binding to cytosine (C), NG binds to T, NI to A, C, G or T, NN binds to A or G, and ING binds to T. These DNA binding repeats have been assembled into proteins with new combinations and numbers of repeats, to make artificial transcription factors that are able to interact with new sequences and activate the expression of a non-endogenous reporter gene in plant cells (Boch et al., ibid). Engineered TAL proteins have been linked to a FokI cleavage half domain to yield a TAL effector domain nuclease fusion (TALEN) exhibiting activity in a yeast reporter assay (plasmid based target). See, e.g., U.S. Patent Publication No. 20110301073; Christian et al. (2010) Genetics epub 10.1534/genetics.110.120717.
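The recognition code listed in the preceding paragraph can be read as a lookup table from each repeat's diresidue to the bases that repeat is said to bind. The following is a minimal sketch of that reading, not a design tool: the table copies the two-letter entries exactly as the paragraph states them (the last entry in the list, which is not written as a two-letter diresidue, is left out), and the names `RVD_CODE`, `rvd_target_pattern` and `matches` are hypothetical.

```python
# Sketch only: diresidue -> bases, transcribed from the paragraph above.
RVD_CODE = {
    "HD": {"C"},
    "NG": {"T"},
    "NI": {"A", "C", "G", "T"},  # as stated in the text
    "NN": {"A", "G"},
}


def rvd_target_pattern(rvds):
    """Return, per repeat, the bases that repeat is listed as binding."""
    return ["".join(sorted(RVD_CODE[rvd])) for rvd in rvds]


def matches(rvds, dna):
    """True if each base of `dna` is allowed by the repeat at that position."""
    dna = dna.upper()
    if len(dna) != len(rvds):
        return False
    return all(base in RVD_CODE[rvd] for rvd, base in zip(rvds, dna))


if __name__ == "__main__":
    repeats = ["HD", "NG", "NN"]
    print(rvd_target_pattern(repeats))
    print(matches(repeats, "CTG"))
```

Under this sketch one repeat corresponds to one contiguous nucleotide of the target, which is the one-to-one correspondence the paragraph describes.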

[0107] In other embodiments, the nuclease is a system comprising the CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats)/Cas (CRISPR Associated) nuclease system. The CRISPR/Cas is an engineered nuclease system based on a bacterial system that can be used for genome engineering. It is based on part of the adaptive immune response of many bacteria and archaea. When a virus or plasmid invades a bacterium, segments of the invader's DNA are converted into CRISPR RNAs (crRNA) by the 'immune' response. This crRNA then associates, through a region of partial complementarity, with another type of RNA called tracrRNA to guide the Cas9 nuclease to a region homologous to the crRNA in the target DNA called a "protospacer". Cas9 cleaves the DNA to generate blunt ends at the DSB at sites specified by a 20-nucleotide guide sequence contained within the crRNA transcript. Cas9 requires both the crRNA and the tracrRNA for site specific DNA recognition and cleavage. This system has now been engineered such that the crRNA and tracrRNA can be combined into one molecule (the "single guide RNA"), and the crRNA equivalent portion of the single guide RNA can be engineered to guide the Cas9 nuclease to target any desired sequence (see Jinek et al. (2012) Science 337, p. 816-821, Jinek et al. (2013) eLife 2:e00471, and David Segal (2013) eLife 2:e00563). Thus, the CRISPR/Cas system can be engineered to create a DSB at a desired target in a genome, and repair of the DSB can be influenced by the use of repair inhibitors to cause an increase in error prone repair.
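As an illustration of how a 20-nucleotide guide sequence specifies a cleavage site, the sketch below lists candidate protospacers on the forward strand of a DNA string. It rests on assumptions that the paragraph above does not state: it requires an "NGG" motif immediately 3' of the protospacer (the motif commonly associated with Streptococcus pyogenes Cas9), it ignores the reverse strand, and the names `GUIDE_LENGTH`, `PAM_REGEX` and `candidate_sites` are hypothetical.

```python
import re

GUIDE_LENGTH = 20        # from the text: a 20-nucleotide guide sequence
PAM_REGEX = "[ACGT]GG"   # assumption: NGG motif, not stated in the text


def candidate_sites(dna):
    """Yield (start, protospacer, pam) for forward-strand candidate sites.

    A zero-width lookahead is used so that overlapping sites are all found.
    """
    dna = dna.upper()
    pattern = re.compile(r"(?=([ACGT]{%d})(%s))" % (GUIDE_LENGTH, PAM_REGEX))
    for match in pattern.finditer(dna):
        yield match.start(), match.group(1), match.group(2)


if __name__ == "__main__":
    example = "C" + "A" * 20 + "AGG"
    for start, protospacer, pam in candidate_sites(example):
        print(start, protospacer, pam)
```

The protospacer returned for a site is the sequence to which the crRNA-equivalent portion of a single guide RNA would be matched; a practical design would also need to scan the complementary strand and assess off-target sites.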

[0108] In certain embodiments, the DNA binding domain of one or more of the nucleases used for in vivo cleavage and/or targeted cleavage of the genome of a cell comprises a zinc finger protein. Preferably, the zinc finger protein is non-naturally occurring in that it is engineered to bind to a target site of choice. See, for example, Beerli et al. (2002) Nature Biotechnol. 20:135-141; Pabo et al. (2001) Ann. Rev. Biochem. 70:313-340; Isalan et al. (2001) Nature Biotechnol. 19:656-660; Segal et al. (2001) Curr. Opin. Biotechnol. 12:632-637; Choo et al. (2000) Curr. Opin. Struct. Biol. 10:411-416; U.S. Patent Nos. 6,453,242; 6,534,261; 6,599,692; 6,503,717; 6,689,558; 7,030,215; 6,794,136; 7,067,317; 7,262,054; 7,070,934; 7,361,635; 7,253,273; and U.S. Patent Publication Nos. 2005/0064474; 2007/0218528; 2005/0267061, all incorporated herein by reference in their entireties.

[0109] An engineered zinc finger binding domain can have a novel binding specificity, compared to a naturally-occurring zinc finger protein. Engineering methods include, but are not limited to, rational design and various types of selection. Rational design includes, for example, using databases comprising triplet (or quadruplet) nucleotide sequences and individual zinc finger amino acid sequences, in which each triplet or quadruplet nucleotide sequence is associated with one or more amino acid sequences of zinc fingers which bind the particular triplet or quadruplet sequence. See, for example, co-owned U.S. Patents 6,453,242 and 6,534,261, incorporated by reference herein in their entireties.

[0110] Exemplary selection methods, including phage display and two-hybrid systems, are disclosed in US Patents 5,789,538; 5,925,523; 6,007,988; 6,013,453; 6,410,248; 6,140,466; 6,200,759; and 6,242,568; as well as WO 98/37186; WO 98/53057; WO 00/27878; WO 01/88197 and GB 2,338,237. In addition, enhancement of binding specificity for zinc finger binding domains has been described, for example, in co-owned WO 02/077227.

[0111] In addition, as disclosed in these and other references, zinc finger domains and/or multi-fingered zinc finger proteins may be linked together using any suitable linker sequences, including for example, linkers of 5 or more amino acids in length. See, also, U.S. Patent Nos. 6,479,626; 6,903,185; and 7,153,949 for exemplary linker sequences 6 or more amino acids in length. The proteins described herein may include any combination of suitable linkers between the individual zinc fingers of the protein.

[0112] Selection of target sites; ZFPs and methods for design and construction of fusion proteins (and polynucleotides encoding same) are known to those of skill in the art and described in detail in U.S. Patent Nos. 6,140,081; 5,789,538; 6,453,242; 6,534,261; 5,925,523; 6,007,988; 6,013,453; 6,200,759; WO 95/19431; WO 96/06166; WO 98/53057; WO 98/54311; WO 00/27878; WO 01/60970; WO 01/88197; WO 02/099084; WO 98/53058; WO 98/53059; WO 98/53060; WO 02/016536 and WO 03/016496.

[0113] In addition, as disclosed in these and other references, zinc finger domains and/or multi-fingered zinc finger proteins may be linked together using any suitable linker sequences, including for example, linkers of 5 or more amino acids in length. See, also, U.S. Patent Nos. 6,479,626; 6,903,185; and 7,153,949 for exemplary linker sequences 6 or more amino acids in length. The proteins described herein may include any combination of suitable linkers between the individual zinc fingers of the protein.

[0114] The CRISPR (clustered regularly interspaced short palindromic repeats) locus, which encodes RNA components of the system, and the cas (CRISPR-associated) locus, which encodes proteins (Jansen et al., 2002. Mol. Microbiol. 43:1565-1575; Makarova et al., 2002. Nucleic Acids Res. 30:482-496; Makarova et al., 2006. Biol. Direct 1:7; Haft et al., 2005. PLoS Comput. Biol. 1:e60) make up the gene sequences of the CRISPR/Cas nuclease system. CRISPR loci in microbial hosts contain a combination of CRISPR-associated (Cas) genes as well as non-coding RNA elements capable of programming the specificity of the CRISPR-mediated nucleic acid cleavage.

[0115] The Type II CRISPR is one of the most well characterized systems and carries out targeted DNA double-strand break in four sequential steps. First, two non-coding RNAs, the pre-crRNA array and tracrRNA, are transcribed from the CRISPR locus. Second, tracrRNA hybridizes to the repeat regions of the pre-crRNA and mediates the processing of pre-crRNA into mature crRNAs containing individual spacer sequences. Third, the mature crRNA:tracrRNA complex directs Cas9 to the target DNA via Watson-Crick base-pairing between the spacer on the crRNA and the protospacer on the target DNA next to the protospacer adjacent motif (PAM), an additional requirement for target recognition. Finally, Cas9 mediates cleavage of target DNA to create a double-stranded break within the protospacer. Activity of the CRISPR/Cas system comprises three steps: (i) insertion of alien DNA sequences into the CRISPR array to prevent future attacks, in a process called 'adaptation', (ii) expression of the relevant proteins, as well as expression and processing of the array, followed by (iii) RNA-mediated interference with the alien nucleic acid. Thus, in the bacterial cell, several of the so-called 'Cas' proteins are involved with the natural function of the CRISPR/Cas system and serve roles in functions such as insertion of the alien DNA etc.

[0116] In certain embodiments, Cas protein may be a "functional derivative" of a naturally occurring Cas protein. A "functional derivative" of a native sequence polypeptide is a compound having a qualitative biological property in common with a native sequence polypeptide. "Functional derivatives" include, but are not limited to, fragments of a native sequence and derivatives of a native sequence polypeptide and its fragments, provided that they have a biological activity in common with a corresponding native sequence polypeptide. A biological activity contemplated herein is the ability of the functional derivative to hydrolyze a DNA substrate into fragments. The term "derivative" encompasses amino acid sequence variants of a polypeptide, covalent modifications, and fusions thereof. Suitable derivatives of a Cas polypeptide or a fragment thereof include but are not limited to mutants, fusions, and covalent modifications of Cas protein or a fragment thereof. Cas protein, which includes Cas protein or a fragment thereof, as well as derivatives of Cas protein or a fragment thereof, may be obtainable from a cell or synthesized chemically or by a combination of these two procedures. The cell may be a cell that naturally produces Cas protein, or a cell that naturally produces Cas protein and is genetically engineered to produce the endogenous Cas protein at a higher expression level or to produce a Cas protein from an exogenously introduced nucleic acid, which nucleic acid encodes a Cas that is the same as or different from the endogenous Cas. In some cases, the cell does not naturally produce Cas protein and is genetically engineered to produce a Cas protein.

[0117] Thus, the nuclease comprises a DNA-binding domain that specifically binds to a target site in any gene into which it is desired to insert a donor (transgene).

B. Cleavage Domains

[0118] Any suitable cleavage domain can be operatively linked to a DNA-binding domain to form a nuclease. For example, ZFP DNA-binding domains have been fused to nuclease domains to create ZFNs - a functional entity that is able to recognize its intended nucleic acid target through its engineered (ZFP) DNA binding domain and cause the DNA to be cut near the ZFP binding site via the nuclease activity. See, e.g., Kim et al. (1996) Proc Natl Acad Sci USA 93(3):1156-1160. More recently, ZFNs have been used for genome modification in a variety of organisms. See, for example, United States Patent Publications 20030232410; 20050208489; 20050026157; 20050064474; 20060188987; 20060063231; and International Publication WO 07/014275. Likewise, TALE DNA-binding domains have been fused to nuclease domains to create TALENs. See, e.g., U.S. Publication No. 20110301073.

[0119] As noted above, the cleavage domain may be heterologous to the DNA-binding domain, for example a zinc finger DNA-binding domain and a cleavage domain from a nuclease, or a TALEN DNA-binding domain and a cleavage domain, or a meganuclease DNA-binding domain and a cleavage domain from a different nuclease. Heterologous cleavage domains can be obtained from any endonuclease or exonuclease. Exemplary endonucleases from which a cleavage domain can be derived include, but are not limited to, restriction endonucleases and homing endonucleases. See, for example, 2002-2003 Catalogue, New England Biolabs, Beverly, MA; and Belfort et al. (1997) Nucleic Acids Res. 25:3379-3388. Additional enzymes which cleave DNA are known (e.g., S1 Nuclease; mung bean nuclease; pancreatic DNase I; micrococcal nuclease; yeast HO endonuclease; see also Linn et al. (eds.) Nucleases, Cold Spring Harbor Laboratory Press, 1993). One or more of these enzymes (or functional fragments thereof) can be used as a source of cleavage domains and cleavage half-domains.

[0120] Similarly, a cleavage half-domain can be derived from any nuclease or portion thereof, as set forth above, that requires dimerization for cleavage activity. In general, two fusion proteins are required for cleavage if the fusion proteins comprise cleavage half-domains. Alternatively, a single protein comprising two cleavage half-domains can be used. The two cleavage half-domains can be derived from the same endonuclease (or functional fragments thereof), or each cleavage half-domain can be derived from a different endonuclease (or functional fragments thereof). In addition, the target sites for the two fusion proteins are preferably disposed, with respect to each other, such that binding of the two fusion proteins to their respective target sites places the cleavage half-domains in a spatial orientation to each other that allows the cleavage half-domains to form a functional cleavage domain, e.g., by dimerizing. Thus, in certain embodiments, the near edges of the target sites are separated by 5-8 nucleotides or by 15-18 nucleotides. However, any integral number of nucleotides or nucleotide pairs can intervene between two target sites (e.g., from 2 to 50 nucleotide pairs or more). In general, the site of cleavage lies between the target sites.

[0121] Restriction endonucleases (restriction enzymes) are present in many species and are capable of sequence-specific binding to DNA (at a recognition site), and cleaving DNA at or near the site of binding. Certain restriction enzymes (e.g., Type IIS) cleave DNA at sites removed from the recognition site and have separable binding and cleavage domains. For example, the Type IIS enzyme Fok I catalyzes double-stranded cleavage of DNA, at 9 nucleotides from its recognition site on one strand and 13 nucleotides from its recognition site on the other. See, for example, US Patents 5,356,802; 5,436,150 and 5,487,994; as well as Li et al. (1992) Proc. Natl. Acad. Sci. USA 89:4275-4279; Li et al. (1993) Proc. Natl. Acad. Sci. USA 90:2764-2768; Kim et al. (1994a) Proc. Natl. Acad. Sci. USA 91:883-887; Kim et al. (1994b) J. Biol. Chem. 269:31,978-31,982. Thus, in one embodiment, fusion proteins comprise the cleavage domain (or cleavage half-domain) from at least one Type IIS restriction enzyme and one or more zinc finger binding domains, which may or may not be engineered.
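The spacing and cleavage geometry described in the two preceding paragraphs reduces to simple arithmetic, sketched below. The numbers are taken from the text (gaps of 5-8 or 15-18 nucleotides between the near edges of the two target sites, and Fok I cuts at 9 and 13 nucleotides from its recognition site); the 0-based, end-exclusive coordinate convention and the names `gap_between`, `is_preferred_gap` and `fok_i_stagger` are assumptions made for illustration only.

```python
# Gaps named for certain embodiments: 5-8 or 15-18 nucleotides, inclusive.
PREFERRED_GAPS = (range(5, 9), range(15, 19))

# From the text: Fok I cuts 9 nt from its recognition site on one strand
# and 13 nt from it on the other.
FOK_I_CUT_OFFSETS = (9, 13)


def gap_between(left_site_end, right_site_start):
    """Nucleotides between the near edges of two target sites.

    Assumed convention: 0-based coordinates, `left_site_end` exclusive.
    """
    return right_site_start - left_site_end


def is_preferred_gap(gap):
    """True if the gap falls in one of the ranges named in the text."""
    return any(gap in allowed for allowed in PREFERRED_GAPS)


def fok_i_stagger():
    """Offset, in nucleotides, between the cuts on the two strands."""
    return abs(FOK_I_CUT_OFFSETS[1] - FOK_I_CUT_OFFSETS[0])


if __name__ == "__main__":
    gap = gap_between(left_site_end=12, right_site_start=18)
    print(gap, is_preferred_gap(gap), fok_i_stagger())
```

Because the text also allows any integral number of intervening nucleotides (e.g., from 2 to 50 nucleotide pairs or more), `is_preferred_gap` only flags the ranges named for certain embodiments and is not a validity check.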

[0122] An exemplary Type IIS restriction enzyme, whose cleavage domain is separable from the binding domain, is Fok I. This particular enzyme is active as a dimer. Bitinaite et al. (1998) Proc. Natl. Acad. Sci. USA 95:10,570-10,575. Accordingly, for the purposes of the present disclosure, the portion of the Fok I enzyme used in the disclosed fusion proteins is considered a cleavage half-domain. Thus, for targeted double-stranded cleavage and/or targeted replacement of cellular sequences using zinc finger-Fok I fusions, two fusion proteins, each comprising a Fok I cleavage half-domain, can be used to reconstitute a catalytically active cleavage domain. Alternatively, a single polypeptide molecule containing a zinc finger binding domain and two Fok I cleavage half-domains can also be used. Parameters for targeted cleavage and targeted sequence alteration using zinc finger-Fok I fusions are provided elsewhere in this disclosure.

[0123] A cleavage domain or cleavage half-domain can be any portion of a protein that retains cleavage activity, or that retains the ability to multimerize (e.g., dimerize) to form a functional cleavage domain.

[0124] Exemplary Type IIS restriction enzymes are described in International Publication WO 07/014275, incorporated herein in its entirety. Additional restriction enzymes also contain separable binding and cleavage domains, and these are contemplated by the present disclosure. See, for example, Roberts et al. (2003) Nucleic Acids Res. 31:418-420.

[0125] In certain embodiments, the cleavage domain comprises one or more engineered cleavage half-domains (also referred to as dimerization domain mutants) that minimize or prevent homodimerization, as described, for example, in U.S. Patent Publication Nos. 20050064474; 20060188987; 20070305346 and 20080131962, the disclosures of all of which are incorporated by reference in their entireties herein. Amino acid residues at positions 446, 447, 479, 483, 484, 486, 487, 490, 491, 496, 498, 499, 500, 531, 534, 537, and 538 of Fok I are all targets for influencing dimerization of the Fok I cleavage half-domains.

[0126] Exemplary engineered cleavage half-domains of Fok I that form obligate heterodimers include a pair in which a first cleavage half-domain includes mutations at amino acid residues at positions 490 and 538 of Fok I and a second cleavage half-domain includes mutations at amino acid residues 486 and 499.

[0127] Thus, in one embodiment, a mutation at 490 replaces Glu (E) with Lys (K); the mutation at 538 replaces Iso (I) with Lys (K); the mutation at 486 replaces Gln (Q) with Glu (E); and the mutation at position 499 replaces Iso (I) with Leu (L). Specifically, the engineered cleavage half-domains described herein were prepared by mutating positions 490 (E→K) and 538 (I→K) in one cleavage half-domain to produce an engineered cleavage half-domain designated "E490K:I538K" and by mutating positions 486 (Q→E) and 499 (I→L) in another cleavage half-domain to produce an engineered cleavage half-domain designated "Q486E:I499L". The engineered cleavage half-domains described herein are obligate heterodimer mutants in which aberrant cleavage is minimized or abolished. See, e.g., U.S. Patent Publication No. 2008/0131962, the disclosure of which is incorporated by reference in its entirety for all purposes. In certain embodiments, the engineered cleavage half-domain comprises mutations at positions 486, 499 and 496 (numbered relative to wild-type FokI), for instance mutations that replace the wild type Gln (Q) residue at position 486 with a Glu (E) residue, the wild type Iso (I) residue at position 499 with a Leu (L) residue and the wild-type Asn (N) residue at position 496 with an Asp (D) or Glu (E) residue (also referred to as "ELD" and "ELE" domains, respectively). In other embodiments, the engineered cleavage half-domain comprises mutations at positions 490, 538 and 537 (numbered relative to wild-type FokI), for instance mutations that replace the wild type Glu (E) residue at position 490 with a Lys (K) residue, the wild type Iso (I) residue at position 538 with a Lys (K) residue, and the wild-type His (H) residue at position 537 with a Lys (K) residue or an Arg (R) residue (also referred to as "KKK" and "KKR" domains, respectively). In other embodiments, the engineered cleavage half-domain comprises mutations at positions 490 and 537 (numbered relative to wild-type FokI), for instance mutations that replace the wild type Glu (E) residue at position 490 with a Lys (K) residue and the wild-type His (H) residue at position 537 with a Lys (K) residue or an Arg (R) residue (also referred to as "KIK" and "KIR" domains, respectively). (See US Patent Publication No. 20110201055). In other embodiments, the engineered cleavage half domain comprises the "Sharkey" and/or "Sharkey'" mutations (see Guo et al. (2010) J. Mol. Biol. 400(1):96-107).
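The designations used in the preceding paragraph follow a wild-type residue / position / replacement residue pattern, with substitutions joined by colons. The sketch below shows one way to parse such designations and apply them to a partial map of wild-type Fok I residues. Only the six positions and residues named in the paragraph are included; the full Fok I sequence is not reproduced, and the names `WILD_TYPE`, `NAMED_VARIANTS`, `parse_designation` and `apply_mutations` are hypothetical.

```python
import re

# Wild-type residues at the positions named in the text (one-letter codes).
WILD_TYPE = {486: "Q", 490: "E", 496: "N", 499: "I", 537: "H", 538: "I"}

# Named half-domain variants, written in the colon-joined style of the text.
NAMED_VARIANTS = {
    "ELD": "Q486E:I499L:N496D",
    "ELE": "Q486E:I499L:N496E",
    "KKK": "E490K:I538K:H537K",
    "KKR": "E490K:I538K:H537R",
    "KIK": "E490K:H537K",
    "KIR": "E490K:H537R",
}


def parse_designation(designation):
    """Split 'E490K:I538K' into [('E', 490, 'K'), ('I', 538, 'K')]."""
    substitutions = []
    for token in designation.split(":"):
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", token)
        if match is None:
            raise ValueError(f"unrecognized substitution: {token!r}")
        substitutions.append((match.group(1), int(match.group(2)), match.group(3)))
    return substitutions


def apply_mutations(residues, designation):
    """Return a copy of `residues` with the designated substitutions applied.

    Raises ValueError if the stated wild-type residue does not match.
    """
    mutated = dict(residues)
    for wild_type, position, replacement in parse_designation(designation):
        if mutated.get(position) != wild_type:
            raise ValueError(f"position {position} is not {wild_type}")
        mutated[position] = replacement
    return mutated


if __name__ == "__main__":
    print(parse_designation("E490K:I538K"))
    print(apply_mutations(WILD_TYPE, NAMED_VARIANTS["ELD"]))
```

Checking the stated wild-type residue before substituting catches transcription errors in a designation, for example a swapped wild-type and replacement letter.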

[0128] Engineered cleavage half-domains described herein can be prepared using any suitable method, for example, by site-directed mutagenesis of wild-type cleavage half-domains (Fok I) as described in U.S. Patent Publication Nos. 20050064474; 20080131962; and 20110201055.

[0129] Alternatively, nucleases may be assembled in vivo at the nucleic acid target site using so-called "split-enzyme" technology (see e.g. U.S. Patent Publication No. 20090068164). Components of such split enzymes may be expressed either on separate expression constructs, or can be linked in one open reading frame where the individual components are separated, for example, by a self-cleaving 2A peptide or IRES sequence. Components may be individual zinc finger binding domains or domains of a meganuclease nucleic acid binding domain.

[0130] Nucleases can be screened for activity prior to use, for example in a yeast-based chromosomal system as described in WO 2009/042163 and 20090068164. Nuclease expression constructs can be readily designed using methods known in the art. See, e.g., United States Patent Publications 20030232410; 20050208489; 20050026157; 20050064474; 20060188987; 20060063231; and International Publication WO 07/014275. Expression of the nuclease may be under the control of a constitutive promoter or an inducible promoter, for example the galactokinase promoter which is activated (de-repressed) in the presence of raffinose and/or galactose and repressed in presence of glucose.

[0131] The Cas9 related CRISPR/Cas system comprises two non-coding RNA components: tracrRNA and a pre-crRNA array containing nuclease guide sequences (spacers) interspaced by identical direct repeats (DRs). To use a CRISPR/Cas system to accomplish genome engineering, both functions of these RNAs must be present (see Cong et al. (2013) Sciencexpress 1/10.1126/science.1231143). In some embodiments, the tracrRNA and pre-crRNAs are supplied via separate expression constructs or as separate RNAs. In other embodiments, a chimeric RNA is constructed where an engineered mature crRNA (conferring target specificity) is fused to a tracrRNA (supplying interaction with the Cas9) to create a chimeric crRNA-tracrRNA hybrid (also termed a single guide RNA) (see Jinek, ibid and Cong, ibid).

Target Sites

[0132] As described in detail above, DNA-binding domains can be engineered to bind to any sequence of choice. An engineered DNA-binding domain can have a novel binding specificity, compared to a naturally-occurring DNA-binding domain. Engineering methods include, but are not limited to, rational design and various types of selection. Rational design includes, for example, using databases comprising triplet (or quadruplet) nucleotide sequences and individual zinc finger amino acid sequences, in which each triplet or quadruplet nucleotide sequence is associated with one or more amino acid sequences of zinc fingers which bind the particular triplet or quadruplet sequence. See, for example, co-owned U.S. Patents 6,453,242 and 6,534,261, incorporated by reference herein in their entireties. Rational design of TAL-effector domains can also be performed. See, e.g., U.S. Publication No. 20110301073.

[0133] Exemplary selection methods applicable to DNA-binding domains, including phage display and two-hybrid systems, are disclosed in US Patents 5,789,538; 5,925,523; 6,007,988; 6,013,453; 6,410,248; 6,140,466; 6,200,759; and 6,242,568; as well as WO 98/37186; WO 98/53057; WO 00/27878; WO 01/88197 and GB 2,338,237. In addition, enhancement of binding specificity for zinc finger binding domains has been described, for example, in co-owned WO 02/077227.

[0134] Selection of target sites; nucleases and methods for design and construction of fusion proteins (and polynucleotides encoding same) are known to those of skill in the art and described in detail in U.S. Patent Application Publication Nos. 20050064474 and 20060188987, incorporated by reference in their entireties herein.

[0135] In addition, as disclosed in these and other references, DNA-binding domains (e.g., multi-fingered zinc finger proteins) may be linked together using any suitable linker sequences, including for example, linkers of 5 or more amino acids. See, e.g., U.S. Patent Nos. 6,479,626; 6,903,185; and 7,153,949 for exemplary linker sequences 6 or more amino acids in length. The proteins described herein may include any combination of suitable linkers between the individual DNA-binding domains of the protein. See, also, U.S. Publication No. 20110301073.

[0136] As noted above, the DNA-binding domains of the nucleases may be targeted to any gene. In certain embodiments, the nuclease (DNA-binding domain component) is targeted to a "safe harbor" locus, which includes, by way of example only, the AAVS1 gene (see U.S. Patent No. 8,110,379), the CCR5 gene (see U.S. Publication No. 20080159996), the Rosa locus (see WO 2010/065123) and/or the albumin locus (see, U.S. Application No. 13/624,193).

Donors

[0137] Described herein are methods of targeted insertion of any polynucleotide into a chosen location. Polynucleotides for insertion can also be referred to as "exogenous" polynucleotides, "donor" polynucleotides or molecules, or "transgenes."

[0138] Surprisingly, it is demonstrated herein that double-stranded donor nucleotides (e.g., plasmids) without homology arms flanking the exogenous sequence (transgene) can be effectively integrated into a selected target region of the genome of a cell following in vivo cleavage of the double-stranded donor. Thus, the double-stranded donors include one or more nuclease binding sites for cleavage of the donor in vivo (in the cell). In certain embodiments, the donor includes two nuclease binding sites. In methods in which targeted integration is achieved by making a double-stranded cut in the target region of the genome (see, e.g., U.S. Patent Nos. 7,888,121; 7,951,925; 8,110,379 and U.S. Patent Publication Nos. 20090263900; 20100129869 and 20110207221), one or more of the nucleases used to cleave the target region may also be used to cleave the donor molecule.

[0139] In certain embodiments, the double-stranded donor includes sequences (e.g., coding sequences, also referred to as transgenes) greater than 1 kb in length, for example between 2 and 200 kb, between 2 and 10 kb (or any value therebetween). The double-stranded donor also includes, for example, at least one nuclease target site. In certain embodiments, the donor includes at least 2 target sites, for example for a pair of ZFNs or TALENs. Typically, the nuclease target sites are outside the transgene sequences, for example, 5' and/or 3' to the transgene sequences, for cleavage of the transgene. The nuclease cleavage site(s) may be for any nuclease(s). In certain embodiments, the nuclease target site(s) contained in the double-stranded donor are for the same nuclease(s) used to cleave the endogenous target into which the cleaved donor is integrated via homology-independent methods.

[0140] As noted above, the donor can be cleaved in vivo and integrated into the genome in a forward ("AB") or in a reverse ("BA") orientation. Targeted integration via in vivo donor cleavage that results in a perfectly ligated AB-orientation insertion will recreate the paired nuclease (e.g., ZFN or TALEN) binding sites with the original spacing between the sites. Such recreated sites are potential substrates for a second round of cleavage by the nucleases. Nuclease cleavage at the recreated sites could result in DNA deletion at the transgene-chromosome junctions (as a result of inaccurate NHEJ-based repair) or even transgene excision. In contrast, reverse (BA) orientation insertions result in formation of two different nuclease pair binding sites (e.g., homodimers of the left and right nucleases). If obligate heterodimer (EL/KK, ELD/KKR, etc.) FokI nuclease domains are used, recreated BA sites will not be re-cleavable since the recreated binding sites are both homodimer sites. See, also, Figure 1C.
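The orientation argument in the preceding paragraph can be followed with a small model, given here as a sketch under stated assumptions rather than as a description of any experiment. It assumes the chromosome and the donor each carry one paired site made of a left ("L") and a right ("R") half-site, that both are cut in the spacer between the half-sites, and that ligation is perfect. The labels and the names `junction_sites` and `recleavable` are hypothetical.

```python
def junction_sites(orientation):
    """Return the half-site pairs recreated at the two donor-chromosome junctions.

    Assumed model: the cut chromosome exposes an L half-site on its left end
    and an R half-site on its right end. The linearized donor, read in the
    forward ("AB") direction, begins with its R half-site and ends with its
    L half-site.
    """
    chromosome_left, chromosome_right = "L", "R"
    donor_first, donor_last = "R", "L"
    if orientation == "BA":
        # Reverse insertion flips which donor end meets which chromosome end.
        donor_first, donor_last = donor_last, donor_first
    elif orientation != "AB":
        raise ValueError("orientation must be 'AB' or 'BA'")
    return [(chromosome_left, donor_first), (donor_last, chromosome_right)]


def recleavable(site, obligate_heterodimer):
    """True if the nuclease pair could cut the recreated site again.

    With obligate heterodimer half-domains only an L/R site can be cut;
    otherwise homodimer sites (L/L or R/R) are treated as cleavable too.
    """
    if obligate_heterodimer:
        return site[0] != site[1]
    return True


if __name__ == "__main__":
    for orientation in ("AB", "BA"):
        sites = junction_sites(orientation)
        print(orientation, sites, [recleavable(s, True) for s in sites])
```

In this model the AB orientation recreates two L/R sites, which remain substrates for a second round of cleavage, whereas the BA orientation yields one L/L and one R/R site, neither of which is cut when obligate heterodimer half-domains are used.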

[0141] Furthermore, changing the nucleotides in the transgene donor nuclease spacer that make up the single-strand 5' overhang, as compared to the wild-type (genomic) sequence, to the reverse complement of the wild-type sequence favors BA-orientation insertion of the cleaved donor (via Watson-Crick base-pairing with the overhangs on the cleaved chromosome), which would create an un-recleavable transgene integration (Figure 1C).

[0142] The transgenes carried on the donor sequences described herein may be isolated from plasmids, cells or other sources using standard techniques known in the art such as PCR. Donors for use can include varying types of topology, including circular supercoiled, circular relaxed, linear and the like. Alternatively, they may be chemically synthesized using standard oligonucleotide synthesis techniques. In addition, donors may be methylated or lack methylation. Donors may be in the form of bacterial or yeast artificial chromosomes (BACs or YACs).

[0143] The double-stranded donor polynucleotides described herein may include one or more non-natural bases and/or backbones. In particular, insertion of a donor molecule with methylated cytosines may be carried out using the methods described herein to achieve a state of transcriptional quiescence in a region of interest.

[0144] The exogenous (donor) polynucleotide may comprise any sequence of interest (exogenous sequence). Exemplary exogenous sequences include, but are not limited to any polypeptide coding sequence (e.g., cDNAs), promoter sequences, enhancer sequences, epitope tags, marker genes, cleavage enzyme recognition sites and various types of expression constructs. Marker genes include, but are not limited to, sequences encoding proteins that mediate antibiotic resistance (e.g., ampicillin resistance, neomycin resistance, G418 resistance, puromycin resistance), sequences encoding colored or fluorescent or luminescent proteins (e.g., green fluorescent protein, enhanced green fluorescent protein, red fluorescent protein, luciferase), and proteins which mediate enhanced cell growth and/or gene amplification (e.g., dihydrofolate reductase). Epitope tags include, for example, one or more copies of FLAG, His, myc, Tap, HA or any detectable amino acid sequence.

[0145] In a preferred embodiment, the exogenous sequence (transgene) comprises a polynucleotide encoding any polypeptide of which expression in the cell is desired, including, but not limited to antibodies, antigens, enzymes, receptors (cell surface or nuclear), hormones, lymphokines, cytokines, reporter polypeptides, growth factors, insect resistant, transcription factors and functional fragments of any of the above. The coding sequences may be, for example, cDNAs.

[0146] For example, the exogenous sequence may comprise a sequence encoding a polypeptide that is lacking or non-functional in the subject having a genetic disease, including but not limited to any of the following genetic diseases: achondroplasia, achromatopsia, acid maltase deficiency, adenosine deaminase deficiency (OMIM No. 102700), adrenoleukodystrophy, aicardi syndrome, alpha-1 antitrypsin deficiency, alpha-thalassemia, androgen insensitivity syndrome, apert syndrome, arrhythmogenic right ventricular dysplasia, ataxia telangiectasia, barth syndrome, beta-thalassemia, blue rubber bleb nevus syndrome, canavan disease, chronic granulomatous diseases (CGD), cri du chat syndrome, cystic fibrosis, dercum's disease, ectodermal dysplasia, fanconi anemia, fibrodysplasia ossificans progressiva, fragile X syndrome, galactosemia, Gaucher's disease, generalized gangliosidoses (e.g., GM1), hemochromatosis, the hemoglobin C mutation in the 6th codon of beta-globin (HbC), hemophilia, Huntington's disease, Hurler syndrome, hypophosphatasia, Klinefelter syndrome, Krabbe's disease, Langer-Giedion syndrome, leukocyte adhesion deficiency (LAD, OMIM No. 116920), leukodystrophy, long QT syndrome, Marfan syndrome, Moebius syndrome, mucopolysaccharidosis (MPS), nail patella syndrome, nephrogenic diabetes insipidus, neurofibromatosis, Niemann-Pick disease, osteogenesis imperfecta, porphyria, Prader-Willi syndrome, progeria, Proteus syndrome, retinoblastoma, Rett syndrome, Rubinstein-Taybi syndrome, Sanfilippo syndrome, severe combined immunodeficiency (SCID), Shwachman syndrome, sickle cell disease (sickle cell anemia), Smith-Magenis syndrome, Stickler syndrome, Tay-Sachs disease, Thrombocytopenia Absent Radius (TAR) syndrome, Treacher Collins syndrome, trisomy, tuberous sclerosis, Turner's syndrome, urea cycle disorder, von Hippel-Lindau disease, Waardenburg syndrome, Williams syndrome, Wilson's disease, Wiskott-Aldrich syndrome, X-linked lymphoproliferative syndrome (XLP, OMIM No. 308240).

[0147] Additional exemplary diseases that can be treated by targeted integration include acquired immunodeficiencies, lysosomal storage diseases (e.g., Gaucher's disease, GM1, Fabry disease and Tay-Sachs disease), mucopolysaccharidosis (e.g., Hunter's disease, Hurler's disease), hemoglobinopathies (e.g., sickle cell diseases, HbC, α-thalassemia, β-thalassemia) and hemophilias.

[0148] In certain embodiments, the exogenous sequences can comprise a marker gene (described above), allowing selection of cells that have undergone targeted integration, and a linked sequence encoding an additional functionality. Non-limiting examples of marker genes include GFP, drug selection marker(s) and the like.

[0149] Additional gene sequences that can be inserted may include, for example, wild-type genes to replace mutated sequences. For example, a wild-type Factor IX gene sequence may be inserted into the genome of a stem cell in which the endogenous copy of the gene is mutated. The wild-type copy may be inserted at the endogenous locus, or may alternatively be targeted to a safe harbor locus.

[0150] In some embodiments, the exogenous nucleic acid sequence (transgene) comprising an agronomic gene or nucleotide sequence encoding a polypeptide of interest may include, for example and without limitation: a gene that confers resistance to a pest or disease (See, e.g., Jones et al. (1994) Science 266:789 (cloning of the tomato Cf-9 gene for resistance to Cladosporium fulvum); Martin et al. (1993) Science 262:1432; Mindrinos et al. (1994) Cell 78:1089 (RPS2 gene for resistance to Pseudomonas syringae); PCT International Patent Publication No. WO 96/30517 (resistance to soybean cyst nematode); PCT International Patent Publication No. WO 93/19181); a gene that encodes a Bacillus thuringiensis protein, a derivative thereof, or a synthetic polypeptide modeled thereon (See, e.g., Geiser et al. (1986) Gene 48:109 (cloning and nucleotide sequence of a Bt δ-endotoxin gene; moreover, DNA molecules encoding δ-endotoxin genes can be purchased from American Type Culture Collection (Manassas, VA), for example, under ATCC Accession Nos. 40098; 67136; 31995; and 31998)); a gene that encodes a lectin (See, e.g., Van Damme et al. (1994) Plant Molec. Biol. 24:25 (nucleotide sequences of several Clivia miniata mannose-binding lectin genes)); a gene that encodes a vitamin-binding protein, e.g., avidin (See PCT International Patent Publication No. US93/06487 (use of avidin and avidin homologues as larvicides against insect pests)); a gene that encodes an enzyme inhibitor, e.g., a protease, proteinase inhibitor, or amylase inhibitor (See, e.g., Abe et al. (1987) J. Biol. Chem. 262:16793 (nucleotide sequence of rice cysteine proteinase inhibitor); Huub et al. (1993) Plant Molec. Biol. 21:985 (nucleotide sequence of cDNA encoding tobacco proteinase inhibitor I); Sumitani et al. (1993) Biosci. Biotech. Biochem. 57:1243 (nucleotide sequence of Streptomyces nitrosporeus alpha-amylase inhibitor) and U.S. Patent 5,494,813); a gene encoding an insect-specific hormone or pheromone, e.g., an ecdysteroid or juvenile hormone, a variant thereof, a mimetic based thereon, or an antagonist or agonist thereof (See, e.g., Hammock et al. (1990) Nature 344:458 (baculovirus expression of cloned juvenile hormone esterase, an inactivator of juvenile hormone)); a gene encoding an insect-specific peptide or neuropeptide that, upon expression, disrupts the physiology of the affected pest (See, e.g., Regan (1994) J. Biol. Chem. 269:9 (expression cloning yields DNA coding for insect diuretic hormone receptor); Pratt et al. (1989) Biochem. Biophys. Res. Comm. 163:1243 (an allatostatin in Diploptera punctata); and U.S. Patent 5,266,317 (genes encoding insect-specific, paralytic neurotoxins)); a gene encoding an insect-specific venom produced in nature by a snake, a wasp, or other organism (See, e.g., Pang et al. (1992) Gene 116:165 (heterologous expression in plants of a gene coding for a scorpion insectotoxic peptide)); a gene encoding an enzyme responsible for a hyperaccumulation of a monoterpene, a sesquiterpene, a steroid, hydroxamic acid, a phenylpropanoid derivative or other molecule with insecticidal activity; a gene encoding an enzyme involved in the modification, including the post-translational modification, of a biologically active molecule, e.g., a glycolytic enzyme, a proteolytic enzyme, a lipolytic enzyme, a nuclease, a cyclase, a transaminase, an esterase, a hydrolase, a phosphatase, a kinase, a phosphorylase, a polymerase, an elastase, a chitinase, or a glucanase, whether natural or synthetic (See, e.g., PCT International Patent Publication No. WO 93/02197 (nucleotide sequence of a callase gene); moreover, DNA molecules containing chitinase-encoding sequences can be obtained, for example, from the ATCC, under Accession Nos. 39637 and 67152; Kramer et al. (1993) Insect Biochem. Molec. Biol. 23:691 (nucleotide sequence of a cDNA encoding tobacco hornworm chitinase); and Kawalleck et al. (1993) Plant Molec. Biol. 21:673 (nucleotide sequence of the parsley ubi4-2 polyubiquitin gene)); a gene encoding a molecule that stimulates signal transduction (See, e.g., Botella et al. (1994) Plant Molec. Biol. 24:757 (nucleotide sequences for mung bean calmodulin cDNA clones); and Griess et al. (1994) Plant Physiol. 104:1467 (nucleotide sequence of a maize calmodulin cDNA clone)); a gene that encodes a hydrophobic moment peptide (See, e.g., PCT International Patent Publication No. WO 95/16776 (peptide derivatives of Tachyplesin which inhibit fungal plant pathogens); and PCT International Patent Publication No. WO 95/18855 (synthetic antimicrobial peptides that confer disease resistance)); a gene that encodes a membrane permease, a channel former, or a channel blocker (See, e.g., Jaynes et al. (1993) Plant Sci 89:43 (heterologous expression of a cecropin-β lytic peptide analog to render transgenic tobacco plants resistant to Pseudomonas solanacearum)); a gene that encodes a viral-invasive protein or complex toxin derived therefrom (See, e.g., Beachy et al. (1990) Ann. Rev. Phytopathol. 28:451); a gene that encodes an insect-specific antibody or immunotoxin derived therefrom (See, e.g., Taylor et al., Abstract #497, Seventh Int'l Symposium on Molecular Plant-Microbe Interactions (Edinburgh, Scotland) (1994) (enzymatic inactivation in transgenic tobacco via production of single-chain antibody fragments)); a gene encoding a virus-specific antibody (See, e.g., Tavladoraki et al. (1993) Nature 366:469 (transgenic plants expressing recombinant antibody genes are protected from virus attack)); a gene encoding a developmental-arrestive protein produced in nature by a pathogen or a parasite (See, e.g., Lamb et al. (1992) Bio/Technology 10:1436 (fungal endo α-1,4-D-polygalacturonases facilitate fungal colonization and plant nutrient release by solubilizing plant cell wall homo-α-1,4-D-galacturonase); Toubart et al. (1992) Plant J. 2:367 (cloning and characterization of a gene which encodes a bean endopolygalacturonase-inhibiting protein)); a gene encoding a developmental-arrestive protein produced in nature by a plant (See, e.g., Logemann et al. (1992) Bio/Technology 10:305 (transgenic plants expressing the barley ribosome-inactivating gene have an increased resistance to fungal disease)).

[0151] In some embodiments, nucleic acids comprising an agronomic gene or nucleotide sequence encoding a polypeptide of interest may also and/or alternatively include, for example and without limitation: genes that confer resistance to an herbicide, such as an herbicide that inhibits the growing point or meristem, for example, an imidazolinone or a sulfonylurea (exemplary genes in this category encode mutant ALS and AHAS enzymes, as described, for example, by Lee et al. (1988) EMBO J. 7:1241, and Miki et al. (1990) Theor. Appl. Genet. 80:449, respectively); glyphosate resistance as conferred by, e.g., mutant 5-enolpyruvylshikimate-3-phosphate synthase (EPSPs) genes (via the introduction of recombinant nucleic acids and/or various forms of in vivo mutagenesis of native EPSPs genes); aroA genes and glyphosate acetyl transferase (GAT) genes, respectively); other phosphono compounds, such as glufosinate phosphinothricin acetyl transferase (PAT) genes from Streptomyces species, including Streptomyces hygroscopicus and Streptomyces viridichromogenes); and pyridinoxy or phenoxy proprionic acids and cyclohexones (ACCase inhibitor-encoding genes). See, e.g., U.S. Patents 4,940,835 and 6,248,876 (nucleotide sequences of forms of EPSPs which can confer glyphosate resistance to a plant). A DNA molecule encoding a mutant aroA gene can be obtained under ATCC accession number 39256. See also U.S. Pat. No. 4,769,061 (nucleotide sequence of a mutant aroA gene). European patent application No. 0 333 033 and U.S. Pat. No. 4,975,374 disclose nucleotide sequences of glutamine synthetase genes, which may confer resistance to herbicides such as L-phosphinothricin. Nucleotide sequences of exemplary PAT genes are provided in European application No. 0 242 246, and DeGreef et al. (1989) Bio/Technology 7:61 (production of transgenic plants that express chimeric bar genes coding for PAT activity). Exemplary genes conferring resistance to phenoxy proprionic acids and cyclohexones, such as sethoxydim and haloxyfop, include the Acc1-S1, Acc1-S2 and Acc1-S3 genes described by Marshall et al. (1992) Theor. Appl. Genet. 83:435. GAT genes capable of conferring glyphosate resistance are described, for example, in WO 2005012515. Genes conferring resistance to 2,4-D, phenoxyproprionic acid and pyridyloxy auxin herbicides are described, for example, in WO 2005107437.

[0152] Nucleic acids comprising an agronomic gene or nucleotide sequence encoding a polypeptide of interest may also include, for example and without limitation: a gene conferring resistance to an herbicide that inhibits photosynthesis, such as a triazine (psbA and gs+ genes) or a benzonitrile (nitrilase gene). See, e.g., Przibilla et al. (1991) Plant Cell 3:169 (transformation of Chlamydomonas with plasmids encoding mutant psbA genes). Nucleotide sequences for nitrilase genes are disclosed in U.S. Patent 4,810,648, and DNA molecules containing these genes are available under ATCC Accession Nos. 53435; 67441; and 67442. See also Hayes et al. (1992) Biochem. J. 285:173 (cloning and expression of DNA coding for a glutathione S-transferase).

[0153] In some embodiments, nucleic acids comprising an agronomic gene or nucleotide sequence encoding a polypeptide of interest may also and/or alternatively include genes that confer or contribute to a value-added trait, for example and without limitation: modified fatty acid metabolism, e.g., by transforming a plant with an antisense gene of stearyl-ACP desaturase to increase stearic acid content of the plant (See, e.g., Knutzon et al. (1992) Proc. Natl. Acad. Sci. U.S.A. 89:2624); decreased phytate content, e.g., introduction of a phytase-encoding gene may enhance breakdown of phytate, adding more free phosphate to the transformed plant (See, e.g., Van Hartingsveldt et al. (1993) Gene 127:87 (nucleotide sequence of an Aspergillus niger phytase gene); a gene may be introduced to reduce phytate content; in maize, for example, this may be accomplished by cloning and then reintroducing DNA associated with the single allele which may be responsible for maize mutants characterized by low levels of phytic acid (See Raboy et al. (1990) Maydica 35:383)); and modified carbohydrate composition effected, e.g., by transforming plants with a gene encoding an enzyme that alters the branching pattern of starch (See, e.g., Shiroza et al. (1988) J. Bacteriol. 170:810 (nucleotide sequence of Streptococcus mutans fructosyltransferase gene); Steinmetz et al. (1985) Mol. Gen. Genet. 20:220 (levansucrase gene); Pen et al. (1992) Bio/Technology 10:292 (α-amylase); Elliot et al. (1993) Plant Molec. Biol. 21:515 (nucleotide sequences of tomato invertase genes); Sogaard et al. (1993) J. Biol. Chem. 268:22480 (barley α-amylase gene); and Fisher et al. (1993) Plant Physiol. 102:1045 (maize endosperm starch branching enzyme II)).

[0154] Construction of such expression cassettes, following the teachings of the present specification, utilizes methodologies well known in the art of molecular biology (see, for example, Ausubel or Maniatis). Before use of the expression cassette to generate a transgenic animal, the responsiveness of the expression cassette to the stress-inducer associated with selected control elements can be tested by introducing the expression cassette into a suitable cell line (e.g., primary cells, transformed cells, or immortalized cell lines).

[0155] Furthermore, although not required for expression, exogenous sequences (transgenes) may also include transcriptional or translational regulatory sequences, for example, promoters, enhancers, insulators, internal ribosome entry sites, sequences encoding 2A peptides and/or polyadenylation signals. Further, the control elements of the genes of interest can be operably linked to reporter genes to create chimeric genes (e.g., reporter expression cassettes).

[0156] Targeted insertion of a transgene of non-coding nucleic acid sequence may also be achieved. Transgenes encoding antisense RNAs, RNAi, shRNAs and micro RNAs (miRNAs) may also be used for targeted insertions.

[0157] In additional embodiments, the donor nucleic acid may comprise non-coding sequences that are specific target sites for additional nuclease designs. Subsequently, additional nucleases may be expressed in cells such that the original donor molecule is cleaved and modified by insertion of another donor molecule of interest. In this way, reiterative integrations of donor molecules may be generated allowing for trait stacking at a particular locus of interest or at a safe harbor locus.

Methods for targeted transgene integration

[0158] The donor molecules disclosed herein are integrated into a genome of a cell via targeted, homology-independent methods. For such targeted integration, the genome is cleaved at a desired location (or locations) using a nuclease, for example, a fusion between a DNA-binding domain (e.g., a zinc finger binding domain or TAL effector domain engineered to bind a site at or near the predetermined cleavage site) and a nuclease domain (e.g., cleavage domain or cleavage half-domain). In certain embodiments, two fusion proteins, each comprising a DNA-binding domain and a cleavage half-domain, are expressed in a cell, and bind to sites which are juxtaposed in such a way that a functional cleavage domain is reconstituted and DNA is cleaved in the vicinity of the target site(s). In one embodiment, cleavage occurs between the binding sites of the two DNA-binding domains. One or both of the DNA-binding domains can be engineered. See, also, U.S. Patent No. 7,888,121; U.S. Patent Publication 20050064474 and International Patent Publications WO05/084190, WO05/014791 and WO 03/080809.

[0159] The nucleases as described herein can be introduced as polypeptides and/or polynucleotides. For example, two polynucleotides, each comprising sequences encoding one of the aforementioned polypeptides, can be introduced into a cell, and when the polypeptides are expressed and each binds to its target sequence, cleavage occurs at or near the target sequence. Alternatively, a single polynucleotide comprising sequences encoding both fusion polypeptides is introduced into a cell. Polynucleotides can be DNA, RNA or any modified forms or analogues of DNA and/or RNA.

[0160] Following the introduction of a double-stranded break in the region of interest, the transgene is integrated into the region of interest in a targeted manner via non-homology dependent methods (e.g., non-homologous end joining (NHEJ)) following linearization of a double-stranded donor molecule as described herein. The double-stranded donor is preferably linearized in vivo with a nuclease, for example one or more of the same or different nucleases that are used to introduce the double-stranded break in the genome. Synchronized cleavage of the chromosome and the donor in the cell may limit donor DNA degradation (as compared to linearization of the donor molecule prior to introduction into the cell). The nuclease target site(s) used for linearization of the donor preferably do not disrupt the transgene(s) sequence(s).

[0161] The transgene may be integrated into the genome in the direction expected by simple ligation of the nuclease overhangs (designated "forward" or "AB" orientation) or in the alternate direction (designated "reverse" or "BA" orientation). In certain embodiments, the transgene is integrated following accurate ligation of the donor and chromosome overhangs. In other embodiments, integration of the transgene in either the BA or AB orientation results in deletion of several nucleotides.
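As a rough illustration of the two orientations, the following sketch builds the sequence expected at a cleaved locus after a forward ("AB") or reverse ("BA") insertion. It is a deliberately simplified model and not a procedure from this specification: the ends are treated as if joined directly, so overhang fill-in and the small deletions mentioned above are ignored, and all function names and sequences are hypothetical.

```python
# Illustrative model only: direct end joining, no overhang processing,
# no nucleotide loss at the junctions.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def integrated_allele(chrom_left: str, chrom_right: str, donor: str,
                      orientation: str = "AB") -> str:
    """Return the top-strand sequence of the locus after donor insertion.

    chrom_left / chrom_right: chromosome sequence on either side of the break.
    donor: linearized donor, top strand, written 5'->3'.
    orientation: "AB" (forward) or "BA" (reverse).
    """
    if orientation == "AB":
        insert = donor.upper()
    elif orientation == "BA":
        # A reversed insertion reads as the donor's reverse complement
        # on the chromosome's top strand.
        insert = reverse_complement(donor)
    else:
        raise ValueError("orientation must be 'AB' or 'BA'")
    return chrom_left.upper() + insert + chrom_right.upper()


def junctions(chrom_left: str, chrom_right: str, donor: str,
              orientation: str = "AB", flank: int = 4) -> tuple:
    """Return the sequences spanning the left and right junctions."""
    allele = integrated_allele(chrom_left, chrom_right, donor, orientation)
    left_end = len(chrom_left)
    right_start = left_end + len(donor)
    return (allele[left_end - flank:left_end + flank],
            allele[right_start - flank:right_start + flank])


print(integrated_allele("AAAA", "TTTT", "GGC", "AB"))
print(integrated_allele("AAAA", "TTTT", "GGC", "BA"))
```

Junction sequences of this kind differ between the two orientations, which is one way a reader might reason about telling them apart in sequencing data.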

Delivery

[0162] The nucleases, polynucleotides encoding these nucleases, donor polynucleotides and compositions comprising the proteins and/or polynucleotides described herein may be delivered in vivo or ex vivo by any suitable means into any cell type.

[0163] Suitable cells include eukaryotic (e.g., animal or plant) and prokaryotic cells and/or cell lines. Non-limiting examples of such cells or cell lines generated from such cells include COS, CHO (e.g., CHO-S, CHO-K1, CHO-DG44, CHO-DUXB11, CHO-DUKX, CHOK1SV), VERO, MDCK, WI38, V79, B14AF28-G3, BHK, HaK, NS0, SP2/0-Ag14, HeLa, HEK293 (e.g., HEK293-F, HEK293-H, HEK293-T), and perC6 cells as well as insect cells such as Spodoptera frugiperda (Sf), or fungal cells such as Saccharomyces, Pichia and Schizosaccharomyces as well as plant cells from monocotyledonous or dicotyledonous plants including but not limited to maize, soybean, cotton, Arabidopsis, wheat, barley, oats, sugar cane, sorghum, forage grasses, alfalfa, tomato, tobacco, potato, rice, sunflower and Brassica. In certain embodiments, the cell line is a CHO, MDCK or HEK293 cell line. Suitable cells also include stem cells such as, by way of example, embryonic stem cells, induced pluripotent stem cells, hematopoietic stem cells, neuronal stem cells and mesenchymal stem cells. In certain embodiments, the plant cells include, but are not limited to, suspension cultures, protoplasts, or organized tissues such as embryos, immature embryos, leaf discs, cotyledons, hypocotyls, and microspores. Methods of delivering nucleases as described herein are described, for example, in U.S. Patent Nos. 6,453,242; 6,503,717; 6,534,261; 6,599,692; 6,607,882; 6,689,558; 6,824,978; 6,933,113; 6,979,539; 7,013,219; and 7,163,824, the disclosures of all of which are incorporated by reference herein in their entireties.

[0164] Nucleases and/or donor constructs as described herein may also be delivered using vectors containing sequences encoding one or more of the zinc finger protein(s). Any vector systems may be used including, but not limited to, plasmid vectors, retroviral vectors, lentiviral vectors, adenovirus vectors, poxvirus vectors, herpesvirus vectors and adeno-associated virus vectors, etc. See, also, U.S. Patent Nos. 6,534,261; 6,607,882; 6,824,978; 6,933,113; 6,979,539; 7,013,219; and 7,163,824, incorporated by reference herein in their entireties. Furthermore, it will be apparent that any of these vectors may comprise one or more of the sequences needed for treatment. Thus, when one or more nucleases and a donor construct are introduced into the cell, the nucleases and/or donor polynucleotide may be carried on the same vector or on different vectors. When multiple vectors are used, each vector may comprise a sequence encoding one or multiple nucleases and/or donor constructs.

[0165] Conventional viral and non-viral based gene transfer methods can be used to introduce nucleic acids encoding nucleases and donor constructs in cells (e.g., mammalian cells) and target tissues. Non-viral vector delivery systems include DNA plasmids, naked nucleic acid, and nucleic acid complexed with a delivery vehicle such as a liposome or poloxamer. Viral vector delivery systems include DNA and RNA viruses, which have either episomal or integrated genomes after delivery to the cell. For a review of in vivo delivery of engineered DNA-binding proteins and fusion proteins comprising these binding proteins, see, e.g., Rebar (2004) Expert Opinion Invest. Drugs 13(7):829-839; Rossi et al. (2007) Nature Biotech. 25(12):1444-1454 as well as general gene delivery references such as Anderson, Science 256:808-813 (1992); Nabel & Felgner, TIBTECH 11:211-217 (1993); Mitani & Caskey, TIBTECH 11:162-166 (1993); Dillon, TIBTECH 11:167-175 (1993); Miller, Nature 357:455-460 (1992); Van Brunt, Biotechnology 6(10):1149-1154 (1988); Vigne, Restorative Neurology and Neuroscience 8:35-36 (1995); Kremer & Perricaudet, British Medical Bulletin 51(1):31-44 (1995); Haddada et al., in Current Topics in Microbiology and Immunology Doerfler and Bohm (eds.) (1995); and Yu et al., Gene Therapy 1:13-26 (1994).

[0166] Methods of non-viral delivery of nucleic acids include electroporation, lipofection, microinjection, biolistics, virosomes, liposomes, immunoliposomes, polycation or lipid:nucleic acid conjugates, naked DNA, artificial virions, and agent-enhanced uptake of DNA. Sonoporation using, e.g., the Sonitron 2000 system (Rich-Mar) can also be used for delivery of nucleic acids.

[0167] Additional exemplary nucleic acid delivery systems include those provided by Amaxa Biosystems (Cologne, Germany), Maxcyte, Inc. (Rockville, Maryland), BTX Molecular Delivery Systems (Holliston, MA) and Copernicus Therapeutics Inc. (see, for example, US6008336). Lipofection is described in, e.g., U.S. Patent Nos. 5,049,386; 4,946,787; and 4,897,355, and lipofection reagents are sold commercially (e.g., Transfectam™ and Lipofectin™). Cationic and neutral lipids that are suitable for efficient receptor-recognition lipofection of polynucleotides include those of Felgner, WO 91/17424, WO 91/16024.

[0168] The preparation of lipid:nucleic acid complexes, including targeted liposomes such as immunolipid complexes, is well known to one of skill in the art (see, e.g., Crystal, Science 270:404-410 (1995); Blaese et al., Cancer Gene Ther. 2:291-297 (1995); Behr et al., Bioconjugate Chem. 5:382-389 (1994); Remy et al., Bioconjugate Chem. 5:647-654 (1994); Gao et al., Gene Therapy 2:710-722 (1995); Ahmad et al., Cancer Res. 52:4817-4820 (1992); U.S. Pat. Nos. 4,186,183, 4,217,344, 4,235,871, 4,261,975, 4,485,054, 4,501,728, 4,774,085, 4,837,028, and 4,946,787).

[0169] Additional methods of delivery include the use of packaging the nucleic acids to be delivered into EnGeneIC delivery vehicles (EDVs). These EDVs are specifically delivered to target tissues using bispecific antibodies where one arm of the antibody has specificity for the target tissue and the other has specificity for the EDV. The antibody brings the EDVs to the target cell surface and then the EDV is brought into the cell by endocytosis. Once in the cell, the contents are released (see MacDiarmid et al (2009) Nature Biotechnology 27(7):643).

[0170] The use of RNA or DNA viral based systems for the delivery of nucleic acids encoding engineered ZFPs takes advantage of highly evolved processes for targeting a virus to specific cells in the body and trafficking the viral payload to the nucleus. Viral vectors can be administered directly to patients (in vivo) or they can be used to treat cells in vitro and the modified cells are administered to patients (ex vivo). Conventional viral based systems for the delivery of ZFPs include, but are not limited to, retroviral, lentivirus, adenoviral, adeno-associated, vaccinia and herpes simplex virus vectors for gene transfer. Integration in the host genome is possible with the retrovirus, lentivirus, and adeno-associated virus gene transfer methods, often resulting in long term expression of the inserted transgene. Additionally, high transduction efficiencies have been observed in many different cell types and target tissues.

[0171] The tropism of a retrovirus can be altered by incorporating foreign envelope proteins, expanding the potential target population of target cells. Lentiviral vectors are retroviral vectors that are able to transduce or infect non-dividing cells and typically produce high viral titers. Selection of a retroviral gene transfer system depends on the target tissue. Retroviral vectors are comprised of cis-acting long terminal repeats with packaging capacity for up to 6-10 kb of foreign sequence. The minimum cis-acting LTRs are sufficient for replication and packaging of the vectors, which are then used to integrate the therapeutic gene into the target cell to provide permanent transgene expression. Widely used retroviral vectors include those based upon murine leukemia virus (MuLV), gibbon ape leukemia virus (GaLV), Simian Immunodeficiency virus (SIV), human immunodeficiency virus (HIV), and combinations thereof (see, e.g., Buchscher et al., J. Virol. 66:2731-2739 (1992); Johann et al., J. Virol. 66:1635-1640 (1992); Sommerfelt et al., Virol. 176:58-59 (1990); Wilson et al., J. Virol. 63:2374-2378 (1989); Miller et al., J. Virol. 65:2220-2224 (1991); PCT US94/05700).

[0172] In applications in which transient expression is preferred, adenoviral based systems can be used. Adenoviral based vectors are capable of very high transduction efficiency in many cell types and do not require cell division. With such vectors, high titer and high levels of expression have been obtained. This vector can be produced in large quantities in a relatively simple system. Adeno-associated virus ("AAV") vectors are also used to transduce cells with target nucleic acids, e.g., in the in vitro production of nucleic acids and peptides, and for in vivo and ex vivo gene therapy procedures (see, e.g., West et al., Virology 160:38-47 (1987); U.S. Patent No. 4,797,368; WO 93/24641; Kotin, Human Gene Therapy 5:793-801 (1994); Muzyczka, J. Clin. Invest. 94:1351 (1994)). Construction of recombinant AAV vectors is described in a number of publications, including U.S. Pat. No. 5,173,414; Tratschin et al., Mol. Cell. Biol. 5:3251-3260 (1985); Tratschin et al., Mol. Cell. Biol. 4:2072-2081 (1984); Hermonat & Muzyczka, PNAS 81:6466-6470 (1984); and Samulski et al., J. Virol. 63:3822-3828 (1989).

[0173] At least six viral vector approaches are currently available for gene transfer in clinical trials, which utilize approaches that involve complementation of defective vectors by genes inserted into helper cell lines to generate the transducing agent.

[0174] pLASN and MFG-S are examples of retroviral vectors that have been used in clinical trials (Dunbar et al., Blood 85:3048-305 (1995); Kohn et al., Nat. Med. 1:1017-102 (1995); Malech et al., PNAS 94:22 12133-12138 (1997)). PA317/pLASN was the first therapeutic vector used in a gene therapy trial (Blaese et al., Science 270:475-480 (1995)). Transduction efficiencies of 50% or greater have been observed for MFG-S packaged vectors (Ellem et al., Immunol Immunother. 44(1):10-20 (1997); Dranoff et al., Hum. Gene Ther. 1:111-2 (1997)).

[0175] Recombinant adeno-associated virus vectors (rAAV) are a promising alternative gene delivery system based on the defective and nonpathogenic parvovirus adeno-associated type 2 virus. All vectors are derived from a plasmid that retains only the AAV 145 bp inverted terminal repeats flanking the transgene expression cassette. Efficient gene transfer and stable transgene delivery due to integration into the genomes of the transduced cell are key features for this vector system (Wagner et al., Lancet 351:9117 1702-3 (1998); Kearns et al., Gene Ther. 9:748-55 (1996)). Other AAV serotypes, including AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, AAV9 and AAVrh.10 and any novel AAV serotype can also be used in accordance with the present invention.

[0176] Replication-deficient recombinant adenoviral vectors (Ad) can be produced at high titer and readily infect a number of different cell types. Most adenovirus vectors are engineered such that a transgene replaces the Ad E1a, E1b, and/or E3 genes; subsequently the replication defective vector is propagated in human 293 cells that supply deleted gene function in trans. Ad vectors can transduce multiple types of tissues in vivo, including nondividing, differentiated cells such as those found in liver, kidney and muscle. Conventional Ad vectors have a large carrying capacity. An example of the use of an Ad vector in a clinical trial involved polynucleotide therapy for antitumor immunization with intramuscular injection (Sterman et al., Hum. Gene Ther. 7:1083-9 (1998)). Additional examples of the use of adenovirus vectors for gene transfer in clinical trials include Rosenecker et al., Infection 24:1 5-10 (1996); Sterman et al., Hum. Gene Ther. 9:7 1083-1089 (1998); Welsh et al., Hum. Gene Ther. 2:205-18 (1995); Alvarez et al., Hum. Gene Ther. 5:597-613 (1997); Topf et al., Gene Ther. 5:507-513 (1998); Sterman et al., Hum. Gene Ther. 7:1083-1089 (1998).

[0177] Packaging cells are used to form virus particles that are capable of infecting a host cell. Such cells include 293 cells, which package adenovirus, and ψ2 cells or PA317 cells, which package retrovirus. Viral vectors used in gene therapy are usually generated by a producer cell line that packages a nucleic acid vector into a viral particle. The vectors typically contain the minimal viral sequences required for packaging and subsequent integration into a host (if applicable), other viral sequences being replaced by an expression cassette encoding the protein to be expressed. The missing viral functions are supplied in trans by the packaging cell line. For example, AAV vectors used in gene therapy typically only possess inverted terminal repeat (ITR) sequences from the AAV genome which are required for packaging and integration into the host genome. Viral DNA is packaged in a cell line, which contains a helper plasmid encoding the other AAV genes, namely rep and cap, but lacking ITR sequences. The cell line is also infected with adenovirus as a helper. The helper virus promotes replication of the AAV vector and expression of AAV genes from the helper plasmid. The helper plasmid is not packaged in significant amounts due to a lack of ITR sequences. Contamination with adenovirus can be reduced by, e.g., heat treatment to which adenovirus is more sensitive than AAV.

[0178] In many gene therapy applications, it is desirable that the gene therapy vector be delivered with a high degree of specificity to a particular tissue type. Accordingly, a viral vector can be modified to have specificity for a given cell type by expressing a ligand as a fusion protein with a viral coat protein on the outer surface of the virus. The ligand is chosen to have affinity for a receptor known to be present on the cell type of interest. For example, Han et al., Proc. Natl. Acad. Sci. USA 92:9747-9751 (1995), reported that Moloney murine leukemia virus can be modified to express human heregulin fused to gp70, and the recombinant virus infects certain human breast cancer cells expressing human epidermal growth factor receptor. This principle can be extended to other virus-target cell pairs, in which the target cell expresses a receptor and the virus expresses a fusion protein comprising a ligand for the cell-surface receptor. For example, filamentous phage can be engineered to display antibody fragments (e.g., Fab or Fv) having specific binding affinity for virtually any chosen cellular receptor. Although the above description applies primarily to viral vectors, the same principles can be applied to nonviral vectors. Such vectors can be engineered to contain specific uptake sequences which favor uptake by specific target cells.

[0179] Gene therapy vectors can be delivered in vivo by administration to an individual patient, typically by systemic administration (e.g., intravenous, intraperitoneal, intramuscular, subdermal, or intracranial infusion) or topical application, as described below. Alternatively, vectors can be delivered to cells ex vivo, such as cells explanted from an individual patient (e.g., lymphocytes, bone marrow aspirates, tissue biopsy) or universal donor hematopoietic stem cells, followed by reimplantation of the cells into a patient, usually after selection for cells which have incorporated the vector.

[0180] Vectors (e.g., retroviruses, adenoviruses, liposomes, etc.) containing nucleases and/or donor constructs can also be administered directly to an organism for transduction of cells in vivo. Alternatively, naked DNA can be administered. Administration is by any of the routes normally used for introducing a molecule into ultimate contact with blood or tissue cells including, but not limited to, injection, infusion, topical application and electroporation. Suitable methods of administering such nucleic acids are available and well known to those of skill in the art, and, although more than one route can be used to administer a particular composition, a particular route can often provide a more immediate and more effective reaction than another route.

[0181] Vectors suitable for introduction of polynucleotides (e.g., nuclease-encoding and/or double-stranded donors) described herein include non-integrating lentivirus vectors (IDLV). See, for example, Ory et al. (1996) Proc. Natl. Acad. Sci. USA 93:11382-11388; Dull et al. (1998) J. Virol. 72:8463-8471; Zufferey et al. (1998) J. Virol. 72:9873-9880; Follenzi et al. (2000) Nature Genetics 25:217-222; U.S. Patent Publication No. 2009/054985.

[0182] Pharmaceutically acceptable carriers are determined in part by the particular composition being administered, as well as by the particular method used to administer the composition. Accordingly, there is a wide variety of suitable formulations of pharmaceutical compositions available, as described below (see, e.g., Remington's Pharmaceutical Sciences, 17th ed., 1989).

[0183] It will be apparent that the nuclease-encoding sequences and donor constructs can be delivered using the same or different systems. For example, the nucleases and donors can be carried by the same vector. Alternatively, a donor polynucleotide can be carried by a plasmid, while the one or more nucleases can be carried by an AAV vector. Furthermore, the different vectors can be administered by the same or different routes (intramuscular injection, tail vein injection, other intravenous injection and/or intraperitoneal administration). The vectors can be delivered simultaneously or in any sequential order.

[0184] Thus, the instant disclosure includes in vivo or ex vivo treatment of diseases and conditions that are amenable to insertion of a transgene encoding a therapeutic protein, for example treatment of hemophilias via nuclease-mediated integration of clotting factors such as Factor VIII (F8). The compositions are administered to a human patient in an amount effective to obtain the desired concentration of the therapeutic polypeptide in the serum or the target organ or cells. Administration can be by any means in which the polynucleotides are delivered to the desired target cells. For example, both in vivo and ex vivo methods are contemplated. Intravenous injection to the portal vein is a preferred method of administration. Other in vivo administration modes include, for example, direct injection into the lobes of the liver or the biliary duct and intravenous injection distal to the liver, including through the hepatic artery, direct injection into the liver parenchyma, injection via the hepatic artery, and/or retrograde injection through the biliary tree. Ex vivo modes of administration include transduction in vitro of resected hepatocytes or other cells of the liver, followed by infusion of the transduced, resected hepatocytes back into the portal vasculature, liver parenchyma or biliary tree of the human patient; see, e.g., Grossman et al. (1994) Nature Genetics 6:335-341.

[0185] The effective amount of nuclease(s) and donor to be administered will vary from patient to patient and according to the therapeutic polypeptide of interest. Accordingly, effective amounts are best determined by the physician administering the compositions and appropriate dosages can be determined readily by one of ordinary skill in the art. After allowing sufficient time for integration and expression (typically 4-15 days, for example), analysis of the serum or other tissue levels of the therapeutic polypeptide and comparison to the initial level prior to administration will determine whether the amount being administered is too low, within the right range or too high. Suitable regimes for initial and subsequent administrations are also variable, but are typified by an initial administration followed by subsequent administrations if necessary. Subsequent administrations may be administered at variable intervals, ranging from daily to annually to every several years. One of skill in the art will appreciate that appropriate immunosuppressive techniques may be recommended to avoid inhibition or blockage of transduction by immunosuppression of the delivery vectors; see, e.g., Vilquin et al. (1995) Human Gene Ther. 6:1391-1401.

[0186] Formulations for both ex vivo and in vivo administrations include suspensions in liquid or emulsified liquids. The active ingredients often are mixed with excipients which are pharmaceutically acceptable and compatible with the active ingredient. Suitable excipients include, for example, water, saline, dextrose, glycerol, ethanol or the like, and combinations thereof. In addition, the composition may contain minor amounts of auxiliary substances, such as, wetting or emulsifying agents, pH buffering agents, stabilizing agents or other reagents that enhance the effectiveness of the pharmaceutical composition.

[0187] Nucleic acids may be introduced into a plant cell in embodiments of the invention by any method known to those of skill in the art, including, for example and without limitation: by transformation of protoplasts (See, e.g., U.S. Patent 5,508,184); by desiccation/inhibition-mediated DNA uptake (See, e.g., Potrykus et al. (1985) Mol. Gen. Genet. 199:183-8); by electroporation (See, e.g., U.S. Patent 5,384,253); by agitation with silicon carbide fibers (See, e.g., U.S. Patents 5,302,523 and 5,464,765); by Agrobacterium-mediated transformation (See, e.g., U.S. Patents 5,563,055, 5,591,616, 5,693,512, 5,824,877, 5,981,840, and 6,384,301); by acceleration of DNA-coated particles (See, e.g., U.S. Patents 5,015,580, 5,550,318, 5,538,880, 6,160,208, 6,399,861, and 6,403,865); and by nanoparticles, nanocarriers and cell-penetrating peptides (WO201126644A2; WO2009046384A1; WO2008148223A1) in the methods to deliver DNA, RNA, peptides and/or proteins or combinations of nucleic acids and peptides into plant cells.

[0188] Through the application of techniques such as these, the cells of virtually any species may be stably transformed. In some embodiments, transforming DNA is integrated into the genome of the host cell. In the case of multicellular species, transgenic cells may be regenerated into a transgenic organism. Any of these techniques may be used to produce a transgenic plant, for example, comprising one or more nucleic acid sequences of the invention in the genome of the transgenic plant.

[0189] The most widely utilized method for introducing an expression vector into plants is based on the natural transformation system of Agrobacterium. A. tumefaciens and A. rhizogenes are plant pathogenic soil bacteria that genetically transform plant cells. The Ti and Ri plasmids of A. tumefaciens and A. rhizogenes, respectively, carry genes responsible for genetic transformation of the plant. The Ti (tumor-inducing) plasmids contain a large segment, known as T-DNA, which is transferred to transformed plants. Another segment of the Ti plasmid, the vir region, is responsible for T-DNA transfer. The T-DNA region is bordered by left-hand and right-hand borders that are each composed of terminal repeated nucleotide sequences. In some modified binary vectors, the tumor-inducing genes have been deleted, and the functions of the vir region are utilized to transfer foreign DNA bordered by the T-DNA border sequences. The T-region may also contain, for example, a selectable marker for efficient recovery of transgenic plants and cells, and a multiple cloning site for inserting sequences for transfer such as a nucleic acid encoding a fusion protein of the invention.

[0190] Thus, in some embodiments, a plant transformation vector is derived from a Ti plasmid of A. tumefaciens (See, e.g., U.S. Patent Nos. 4,536,475, 4,693,977, 4,886,937, and 5,501,967; and European Patent EP 0 122 791) or an Ri plasmid of A. rhizogenes. Additional plant transformation vectors include, for example and without limitation, those described by Herrera-Estrella et al. (1983) Nature 303:209-13; Bevan et al. (1983), supra; Klee et al. (1985) Bio/Technol. 3:637-42; and in European Patent EP 0 120 516, and those derived from any of the foregoing. Other bacteria, such as Sinorhizobium, Rhizobium, and Mesorhizobium, that naturally interact with plants can be modified to mediate gene transfer to a number of diverse plants. These plant-associated symbiotic bacteria can be made competent for gene transfer by acquisition of both a disarmed Ti plasmid and a suitable binary vector.

[0191] The following Examples relate to exemplary embodiments of the present disclosure in which the nuclease comprises a zinc finger nuclease (ZFN) or a TALEN. It will be appreciated that this is for purposes of exemplification only and that other nucleases can be used, for instance a CRISPR/Cas nuclease system or homing endonucleases (meganucleases) with engineered DNA-binding domains and/or fusions of naturally occurring or engineered homing endonuclease (meganuclease) DNA-binding domains and heterologous cleavage domains.

EXAMPLES

Example 1: Materials and Methods

Cell growth, transfection, and ZFN/TALEN assay.

[0192] Transfection of K562 (ATCC CCL-243) used Amaxa Solution V and program T-016; CHO-K1 (ATCC CCL-61), Amaxa Solution T and program U-023. All transfections contained 10⁶ cells and the following plasmids: AAVS1, 3 μg of 2A-linked ZFNs and donor; IL2Rγ and CCR5, 2 μg of each unlinked ZFN plasmid and 10 μg of the donor plasmid; GS, FUT8, 3 μg of 2A-linked ZFNs and 10 μg of donor plasmid. See, U.S. Patent No. 8,110,379 and U.S. Patent Publication Nos. 20100129869 and 20090042250 and DeKelver et al. (2010) Genome Res. 20(8):1133-1142; Liu et al. (2009) Biotechnol. Bioeng. 106(1):97-105; Malphettes et al. (2010) Biotechnology and Bioengineering 106(5):774-83; Perez et al. (2008) Nature Biotech. 26(7):808-816; Urnov et al. (2005) Nature 435(7042):646-651 for further details including ZFN designs.

[0193] The FUT8 TALE nuclease pair (SBS 101082 and SBS 101086) directly overlaps the ZFN binding site in exon 10 of FUT8 and was constructed using the Δ152/+63 N- and C-terminal truncation points (Miller et al. 2011). The binding site of SBS 101082 FUT8 TALE is 5'-tgt atc tgg cca ctg at-3' (SEQ ID NO:1); SBS 101086, 5'-ttt gtc ttt gcc tcc tt-3' (SEQ ID NO:2).

Donor plasmid design and construction

[0194] Oligos containing ZFN target sites for the AAVS1 (5'-tgt ccc ctc cAC CCC ACA GTG Ggg cca cTA GGG ACA GGA Ttg gtg aca ga-3', SEQ ID NO:3), spacer-flipped AAVS1 (5'-tgt ccc ctc cAC CCC ACA GTG Ggt ggc cTA GGG ACA GGA Ttg gtg aca ga-3', SEQ ID NO:541), GS (5'-gac cCC AAG CCC ATT CCT GGG Aac tgg aAT GGT GCA GGC Tgc cat acc aa-3', SEQ ID NO:4), and IL2Rγ (5'-gtt tcg tgt tCG GAG CCG CTT Taa ccc ACT CTG TGG AAG tgc tca gca tt-3', SEQ ID NO:5) ZFN pairs were annealed to their reverse complements in 50 mM NaCl, 10 mM Tris pH 7.5, and 1 mM EDTA. See, also, U.S. Patent Publication No. 20100129869 and U.S. Patent Nos. 7,951,925 and 8,110,379. Capital letters denote the ZFN binding sites while lowercase letters denote flanking and spacer sequence.
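The capital/lowercase convention can be checked programmatically. The following is a minimal sketch, not part of the disclosed methods: it splits a target-site oligo into its two ZFN binding sites and the intervening spacer, and confirms that the spacer of the second AAVS1 oligo is the reverse complement of the first. The helper names `reverse_complement` and `split_target` are hypothetical, and the sequences are transcribed here for illustration only and should be checked against the sequence listing.

```python
import re

# Complement table covering both cases so that case annotation is preserved.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (case preserved per base)."""
    return seq.translate(COMPLEMENT)[::-1]


def split_target(oligo: str):
    """Split an oligo into (left_site, spacer, right_site).

    Assumes the layout flank-SITE-spacer-SITE-flank, with binding sites in
    capitals and flank/spacer in lowercase; spaces are ignored.
    """
    seq = oligo.replace(" ", "")
    match = re.fullmatch(r"[acgt]*([ACGT]+)([acgt]+)([ACGT]+)[acgt]*", seq)
    if match is None:
        raise ValueError("expected flank-SITE-spacer-SITE-flank layout")
    return match.groups()


AAVS1 = "tgt ccc ctc cAC CCC ACA GTG Ggg cca cTA GGG ACA GGA Ttg gtg aca ga"
AAVS1_FLIPPED = "tgt ccc ctc cAC CCC ACA GTG Ggt ggc cTA GGG ACA GGA Ttg gtg aca ga"

left, spacer, right = split_target(AAVS1)
_, flipped_spacer, _ = split_target(AAVS1_FLIPPED)

# The two oligos share binding sites and differ only in spacer orientation.
spacer_is_flipped = reverse_complement(spacer) == flipped_spacer
```

Under these assumptions both oligos carry the same two 12-nucleotide binding sites, and only the 6-nucleotide spacer differs in orientation.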

[0195] The double-stranded products were then cloned into the EcoRV site of the pBluescript II KS- vector (Agilent). The CCR5 donor plasmid resulted from insertion of the CCR5 target site oligonucleotides (5'-GTC ATC CTC ATC CTG ATA AAC TGC AAA AGa-3', SEQ ID NO:6; 5'-CTT TTG CAG TTT ATC AGG ATG AGG ATG ACa-3', SEQ ID NO:7; see, also, U.S. Patent No. 7,951,925) into pCR2.1 (Invitrogen). The second AAVS1 donor plasmid (see, Figure 1E) was made by insertion of the above AAVS1 target site oligos into the EcoRV site of a pCR2.1-based plasmid also containing the GFP open reading frame driven by the pGK promoter.

[0196] The FUT8 donor plasmid was made via insertion of the ZFN/TALEN binding site (5'- ggc CGT GTA TCT GGC CAC TGA TGA CCC TTC TTt gtt aAA GGA GGC AAA GAC AAA Gta a -3', SEQ ID NO:8) into a donor plasmid containing IgG and puromycin resistance transgenes (Moehle et al. (2007) Proc. Nat'l Acad. Sci. USA 104(9):3055-3060).

Assay of targeted integration

[0197] All PCR reactions were performed with 100 ng genomic DNA as a template, using Accuprime HiFi™ polymerase (Invitrogen). Genomic DNA was purified with the Masterpure™ kit (Epicentre).

[0198] Targeted integration of the AAVS1 GFP donor plasmid at AAVS1 (also known as PPP1R12C) was assayed at all four possible chromosome-donor junctions via PCR amplification. PCR reactions used a 60°C annealing temperature, a 30 second extension time, 30 cycles of amplification, and the following primers: AB left, AAVS1 CEL-I F (5'-ccc ctt acc tct cta gtc tgt gc-3', SEQ ID NO:9) and AAVS1 Junction R (5'-ggc gat taa gtt ggg taa cg-3', SEQ ID NO:10); AB right, AAVS1 Junction F (5'-ggc ctc ttg gtc aag ttg tt-3', SEQ ID NO:11) and AAVS1 CEL-I R (5'-ctc agg ttc tgg gag agg gta g-3', SEQ ID NO:12); BA left, AAVS1 CEL-I F and AAVS1 Junction F; BA right, AAVS1 Junction R and AAVS1 CEL-I R. For the sequences in Figure 8A, the following primers were used: AB left, M13F (5'-gta aaa cga cgg cca gt-3', SEQ ID NO:13) and AAVS1 CEL-I F; BA right, M13F and AAVS1 CEL-I R.
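The pairing of primers with junctions can be summarized as a lookup table. The sketch below is illustrative only: the primer names come from the paragraph above, while the function `call_orientations` and the rule that an orientation is reported only when both of its junctions amplify are assumptions made for this example, not a scoring rule stated in the disclosure.

```python
# Primer pair expected to amplify each chromosome-donor junction (from the text).
JUNCTION_PRIMERS = {
    ("AB", "left"): ("AAVS1 CEL-I F", "AAVS1 Junction R"),
    ("AB", "right"): ("AAVS1 Junction F", "AAVS1 CEL-I R"),
    ("BA", "left"): ("AAVS1 CEL-I F", "AAVS1 Junction F"),
    ("BA", "right"): ("AAVS1 Junction R", "AAVS1 CEL-I R"),
}


def call_orientations(positive_pairs):
    """Hypothetical helper: map PCR-positive primer pairs to insert orientations.

    An orientation is reported only when both its left and right junctions
    amplified (an assumption for this sketch). Primer order within a pair
    does not matter.
    """
    positive = {frozenset(pair) for pair in positive_pairs}
    detected = {}
    for (orientation, side), primers in JUNCTION_PRIMERS.items():
        if frozenset(primers) in positive:
            detected.setdefault(orientation, set()).add(side)
    return sorted(o for o, sides in detected.items() if sides == {"left", "right"})


example_call = call_orientations([
    ("AAVS1 CEL-I F", "AAVS1 Junction R"),
    ("AAVS1 CEL-I R", "AAVS1 Junction F"),
])
```

Note that the same two donor-side primers (Junction F and Junction R) serve both orientations; only their pairing with the chromosomal CEL-I primers changes.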

[0199] Targeted integration at IL2Rγ, CCR5, and GS was assayed via PCR amplification. PCR reactions used a 58°C annealing temperature, a 30 second extension time, 5% DMSO, 26 cycles of amplification, and the M13F primer in combination with the following primers: IL2Rγ BA left, IL2Rγ CEL-I F (5'-acc agt gag ttt tca tta gg-3', SEQ ID NO:14); IL2Rγ AB right, IL2Rγ CEL-I R (5'-tgg agc aaa aga cag tgg tg-3', SEQ ID NO:15); CCR5 BA left, R5F (5'-aag atg gat tat caa gtg tca agt cc-3', SEQ ID NO:16); CCR5 AB right, R5R (5'-caa agt ccc act ggg cg-3', SEQ ID NO:17); GS BA left, GJC 172F (5'-atc cgc atg gga gat cat ct-3', SEQ ID NO:18); GS AB right, GJC 173R (5'-gtg tat gtt cgt tca ccc ac-3', SEQ ID NO:19).

[0200] Targeted integration of the AAVS1 donor at GS (Figure 2B) was assayed via PCR amplification. PCR reactions used a 58°C annealing temperature, a 30 second extension time, 5% DMSO, 26 cycles of amplification, and the following primers: AB left, Jcn1F (5'-caa ata gga ccc tgt gaa gga-3', SEQ ID NO:20) and Jcn1R (5'-gat taa gtt ggg taa cgc cag-3', SEQ ID NO:21); BA left, Jcn3F (5'-aat agg acc ctg tga agg a-3', SEQ ID NO:22) and Jcn3R (5'-gtg tgg aat tgt gag cgg ata-3', SEQ ID NO:23).

[0201] Targeted integration of the IgG donor at FUT8 (Figure 4) was assayed via PCR amplification. PCR reactions used a 60°C annealing temperature, a 30 second extension time, 30 cycles of amplification (35 cycles for screening of crude lysates), and the following primers: AB left junction, GJC 75F (5'-agt cca tgt cag acg cac tg-3', SEQ ID NO:24) and SC seqpfzR (5'-aga gtg agg ctc tgt ctc aa-3', SEQ ID NO:25); AB right junction, FUT8 donor CELIF2 (5'-tac gta tag gct gcg caa ct-3', SEQ ID NO:26) and GJC 115R (5'-gca cat gta gtc ttt gat ttt g-3', SEQ ID NO:27); BA left junction, GJC 75F and FUT8 donor CELIF2; BA right junction, SC seqpfzR and GJC 115R.

[0202] The Southern blot of AAVS1 GFP donor integration at AAVS1 was probed as previously described (DeKelver et al. 2010, ibid). Expected results from this Southern blot are as follows: an AAVS1 probe will hybridize to either a 2092 or 6592 bp band for the AB and BA orientations, respectively. The wild-type, triploid AAVS1 locus will be seen as a 3287 bp band. The Southern blot of AAVS1 GFP donor integration elsewhere in the genome was probed with the complete open reading frame of GFP. Integration at AAVS1 will produce either 3323 or 4482 bp bands for the AB and BA orientations, respectively; non-targeted integrations elsewhere in the genome will produce secondary bands of indeterminable size. The Southern blot of GS donor integration at GS was probed with a 424 bp fragment of the GS gene bounded by 5'-ctg cag gtg aag aca gga tg-3' and 5'-ccc act aga aag aac atg tt-3'. Integration at GS will be revealed as a hybridizing band at 2933 bp; the wild-type GS locus will produce a 1977 bp band. The Southern blot of GS donor integration elsewhere in the genome was probed with a BsaI-ScaI fragment of the E. coli bla gene. Correctly integrated transgenes will give a 2055 bp band; integrations into the GS pseudogenes will give bands of 4878, 4214, 10080, and 9416 bp depending on the pseudogene and insert orientation; other non-targeted integrations will produce a single band of unpredictable size. The Southern blot of integration at FUT8 was probed with a 407 bp HindIII-XmnI fragment of the FUT8 locus. For both FUT8 Southerns, the genomic DNA was cut with HindIII.

[0203] Contigs containing FUT8, GS, and GS pseudogenes were extracted from the whole-CHO genome sequencing data using a custom Python script (Xu et al. (2011) Nat Biotechnol 29(8):735-41). FUT8 is present on contig AFTD01065932.1; GS on contig AFTD01107178.1. One GS pseudogene (contained in AFTD01043599.1) has perfect conservation of the ZFN binding sites, 120/128 (94%) bp of homology to the exon 5 portion of the probe, and is expected to be present in a 6333 bp ScaI fragment. The second GS pseudogene (contained in AFTD01154859.1) has one mismatch in the ZFN binding sites, 116/128 (91%) bp of homology to the exon 5 portion of the probe, and is expected to be present in a 13320 bp ScaI fragment.
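The custom script itself is not reproduced in this disclosure. As a rough illustration of the kind of operation involved, the sketch below pulls named contigs out of FASTA-formatted text by accession. Everything here is an assumption made for the example: the function name `extract_contigs`, the FASTA input format, and the toy records (whose sequences are placeholders, not CHO genome data). Only the accession numbers are taken from the paragraph above.

```python
from typing import Dict, Iterable, List, Set


def extract_contigs(fasta_lines: Iterable[str], wanted: Set[str]) -> Dict[str, str]:
    """Return {accession: sequence} for FASTA records whose ID is in `wanted`.

    The accession is taken as the first whitespace-delimited token after '>'.
    """
    parts: Dict[str, List[str]] = {}
    current = None  # accession currently being collected, or None to skip
    for raw in fasta_lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            accession = line[1:].split()[0]
            current = accession if accession in wanted else None
            if current is not None:
                parts[current] = []
        elif current is not None:
            parts[current].append(line)
    return {acc: "".join(chunks) for acc, chunks in parts.items()}


# Toy input: placeholder sequences only, used to show the call pattern.
TOY_FASTA = """\
>AFTD01065932.1 placeholder record
ACGTACGT
ACGT
>AFTD00000000.1 unrelated placeholder record
TTTT
>AFTD01107178.1 placeholder record
GGCC
""".splitlines()

TARGETS = {"AFTD01065932.1", "AFTD01107178.1", "AFTD01043599.1", "AFTD01154859.1"}
contigs = extract_contigs(TOY_FASTA, TARGETS)
```

Accessions absent from the input are simply missing from the result, so a caller can compare the returned keys against the requested set to detect contigs that were not found.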

[0204] Antibody concentrations were measured using the Pierce Easy-Titer IgG Assay Kit (23310) according to the manufacturer's instructions. Clones with a signal at least two-fold higher than background were classified as positive.

Example 2: Targeted integration following in vivo cleavage of a double-stranded donor

A. AAVS1

[0205] To test whether transgene cleavage in vivo using the same ZFN that cuts the genomic target site would synchronize donor and chromosome cleavage, minimizing the vulnerability of the transgene to degradation, K562 cells were transfected with AAVS1-targeted ZFNs and a donor plasmid that includes the AAVS1 ZFN target sites for cleavage of the donor plasmid in vivo. Briefly, as described in Example 1, we cloned the recognition site for the well-characterized and highly active AAVS1 ZFNs into a donor plasmid containing an autonomous GFP expression cassette but lacking homology to the AAVS1 locus. See, U.S. Patent No. 8,110,379 and DeKelver et al. (2010) Genome Res. 20(8):1133-1142. The donor plasmid (with or without the ZFN target site) was co-transfected into K562 cells along with a second plasmid encoding the AAVS1 ZFNs.

[0206] Insertion into the chromosomal AAVS1 site was assayed by PCR amplification of the unique junctions formed by targeted donor integration, using genomic DNA isolated 3 days post-transfection as described above in Example 1.

[0207] As shown in Figure 1, when co-transfected with the cognate ZFNs, simultaneous cleavage of both a ZFN site-containing donor plasmid and the chromosome will occur, allowing insertion of the plasmid into the chromosome. Insertion of the donor plasmid in the direction expected by simple ligation of the ZFN overhangs was designated as the AB orientation; the alternate direction was designated as the BA orientation. As shown in Figure 1D, the BA (reverse) orientation is favored when the nucleotide sequence between the target sites (spacer) is the reverse complement of the genomic (wild-type) sequence.

[0208] Furthermore, as shown in Figures 1D and 1E (see lanes 6, 8, 10, 17, 19 and 21 of 1E), consistent with successful capture of the cleaved donor DNA, we detected the expected 5' and 3' junctions formed by donor integration in both the AB and the BA orientations. The BA orientation was favored with the reverse complement spacers (Figure 1D). Donor integration required ZFN-mediated cleavage as both (i) donor without an AAVS1 ZFN site was not integrated despite efficient cleavage of the AAVS1 locus (see, Figure 1E, lanes 5, 7, 9, and lanes 16, 18, and 20; Figure 5A) and (ii) transfection of a donor without co-transfection of the corresponding ZFN also failed to yield targeted integration.

[0209] Cell clones were obtained by limiting dilution from the pool transfected with both ZFN and donor (lane 8/19). Three GFP-positive and junction PCR-positive clones were analyzed in duplicate by Southern blot to confirm integration of the donor plasmid. The clones fall into three classes: clone one contains one AB insertion; clone two contains one BA insertion; clone three contains both AB and BA insertions in addition to a non-inserted allele (Figure 1F).

[0210] The three clones were also analyzed for off-target integration by Southern blotting with a GFP-specific probe. Clone one contains only the expected insertion at AAVS1 whereas clones 2 and 3 contain a transgene insertion elsewhere in the genome in addition to the AAVS1 insertions (Figure 1F).

[0211] PCR amplicons of the chromosome-donor integration junctions from these three cell lines were cloned and sequenced. As shown in Figure 7, clones 1 and 3 contained AB insertions with perfect ligation of the donor and chromosomal overhangs at both the 5' and 3' junctions. Clones 2 and 3 contained BA insertions with alleles produced by microhomology-driven repair at the left, 5' junction.

B. IL2Rγ, GS and CCR5

[0212] To demonstrate that capture of a cleaved donor was not restricted to integration into AAVS1 in K562 cells, we performed analogous experiments at three other loci (IL2Rγ, CCR5 and GS) in K562 and CHO cells. Successful targeted integration was monitored at one chromosome-donor junction for each orientation (AB and BA) as described above.

[0213] Site-specific integration targeted to the site of ZFN cleavage was observed for the IL2Rγ and CCR5 loci in K562 cells and for the GS locus in CHO-K1 cells (Figure 2A, lanes 3, 7, 11, 15, 19, and 23). As with integration at AAVS1, integration at IL2Rγ, CCR5, and GS was dependent upon inclusion of the ZFN cleavage site in the donor plasmid (Figure 2A, lanes 1, 5, 9, 13, 17, and 21) and the co-delivery of the ZFNs themselves. ZFN activity, both at the chromosomal target and on the donor plasmid, was essentially uniform across all samples (Figure 5B).

[0214] Sequencing of chromosome-donor junction PCR products from these loci, as well as from an analogous pool of AAVS1 integrants, revealed a spectrum of insertion events consistent with correct integration at the targeted locus (Figure 8).

[0215] Thus, the ability to capture an in vivo cleaved transgene donor at a DSB is a general property of the mammalian DNA repair machinery and is independent of the specific target site or cell type.

Example 3: In vivo and in vitro cleavage

[0216] To confirm that in vivo cleavage was necessary to support the observed levels of targeted gene insertion, we performed a direct comparison of targeted integration using in vivo cleaved donors and donors cleaved in vitro using EcoRV, as described in Example 1.

[0217] As shown in Figure 2, while integration of pre-cleaved donor plasmids was occasionally detectable, it was markedly less efficient compared to the in vivo-cleaved donors (Figure 2A, compare lanes 2/3, 6/7, 10/11, 14/15, 18/19, and 22/23). Moreover, the use of pre-cleaved donor DNAs showed an increased range of junction PCR sizes consistent with an increased level of donor DNA degradation prior to chromosomal capture (see, e.g., Figure 2A, lane 22).

[0218] To confirm targeted integration could be stimulated via the use of two different nucleases (ZFNs), we used the GS ZFN pair (Example 1) to cut the chromosome of CHO-K1 cells and the AAVS1 ZFN pair to cleave a donor plasmid in the same cell. Integration at GS was detected at a similar frequency both when the GS ZFN pair cut the chromosome and the donor (as in Figure 2A) and when the GS ZFNs cut the chromosome while the AAVS1 ZFNs cut the donor (Figure 2B, lanes 6 and 8, lanes 15 and 17). Cleavage efficiency at GS was again uniform over all GS ZFN-transfected samples (Figure 5C).

[0219] Thus, in vivo cleavage is more efficient than pre-cleavage of the donor molecule.

Example 4: Homology-independent targeted integration into CHO cells

[0220] Targeted integration in CHO cells has particularly important applications in biotechnology, yet CHO cells perform HDR-based targeted integration of several-kilobase transgenes very poorly. To highlight this point, we compared HDR-mediated targeted integration in both HEK-293 cells and CHO-K1 cells using a system designed to deliver a promoterless GFP gene into a promoter-containing acceptor locus, essentially as described in Moehle et al. (2007) Proc. Nat'l Acad. Sci. USA 104(9):3055-3060. Targeted integration results in expression of GFP and allows quantitation by flow cytometry.

[0221] When transfected with ZFNs and a homology-containing donor plasmid (for integration via HDR), between 0.5 and 3% of HEK-293 cells became GFP-positive (Figure 6). In contrast, none of the CHO-K1 cells became GFP-positive when similarly transfected. Given that CHO cells perform HDR-based targeted integration poorly, and yet have proven their utility for recombinant protein production, we next asked whether in vivo cleavage of donor DNA could be exploited to drive targeted integration in CHO cells.

[0222] CHO-K1 cells from the pool bearing targeted integration at GS (Figure 2A, lanes 19/23) were cloned by limiting dilution and single-cell derived clones screened by PCR for site-specific integration. In contrast to the negative results obtained with the HDR-based approach, homology-independent capture of the in vivo cleaved donor DNA yielded 10% (17/157) single-cell derived clones that were PCR-positive for the left chromosome-donor BA junction, 8% (13) positive for the right BA junction, and 6% (10) positive for both BA junctions. Eight of these ten clones were chosen randomly for analysis by Southern blotting. All 8 clones contained the expected targeted transgene insertion at the GS target site, a wild-type GS allele, and two GS pseudogenes (Figure 3). Only the wild-type GS allele and the pseudogenes are present in wild-type CHO-K1 cells (Figure 3, lane 9). Furthermore, when probed with a transgene-specific sequence, five of the 8 clones were shown to contain only one copy of the transgene at GS, whereas 3 contained a transgene copy at GS along with one or more randomly integrated copies at other sites in the CHO genome, one of which corresponded to integration into a GS pseudogene (Figure 3, lanes 10, 12, 13, 16, 17 and lanes 11, 14, 15, respectively). The chromosome-transgene junctions were sequenced and are shown in Figure 9.

Example 5: Targeted integration into and disruption of FUT8

[0223] Transgenes are routinely inserted into the CHO cell genome to produce biopharmaceutical proteins, notably antibodies. CHO cells with a deletion of the FUT8 gene yield non-fucosylated antibodies with 100-fold higher antibody-dependent cellular cytotoxicity (Malphettes et al. (2010) Biotechnology and Bioengineering 106(5):774-83; Yamane-Ohnuki et al. (2004) Biotech. Bioeng. 87(5):614-22). Moreover, knockout of FUT8 expression can be selected for, thus potentially coupling targeted integration with this selectable trait. We therefore used the previously described FUT8-specific ZFNs to disrupt the FUT8 gene via insertion of an in vivo-cleaved antibody production cassette (Moehle et al. (2007) Proc. Nat'l Acad. Sci. USA 104(9):3055-3060). Furthermore, we wished to determine whether capture of an in vivo cleaved donor could occur at a double-strand break produced by a TALE nuclease (TALEN) specific for FUT8.

[0224] Briefly, ZFNs or TALENs that cleave FUT8 were cotransfected with an antibody expression plasmid containing a FUT8 nuclease cleavage site as described in Example 1. The transfected pool was selected for biallelic FUT8 knockout using Lens culinaris agglutinin and cells were cloned by limiting dilution (see, Malphettes et al. (2010) Biotechnology and Bioengineering 106(5):774-83). Clones were screened for secretion of IgG and for insertion of the IgG transgene by PCR of both left and right transgene/chromosome junctions.

[0225] As shown in Figure 4, among ZFN-treated clones, 25/96 (26%) expressed IgG and 14/96 (15%) were positive for insertion of the complete IgG transgene. All but one clone positive for insertion by PCR expressed IgG. Similar results were obtained with the FUT8 TALENs: 35/171 (20%) of clones expressed IgG and all 16 (9%) of the clones with complete transgene insertion expressed IgG. Clones with one (but not both) transgene integration junctions detectable by PCR accounted for a significant fraction of the remaining IgG-expressing clones (Figure 4A).
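The percentages quoted for the FUT8 screens follow directly from the clone counts. As a quick arithmetic check, the sketch below recomputes them; the dictionary layout and the helper names `percent` and `summarize` are hypothetical conveniences for this example, and only the counts are taken from the text.

```python
def percent(positive: int, total: int) -> int:
    """Percentage of positive clones, rounded to the nearest whole number."""
    return round(100 * positive / total)


# Clone counts as reported for the ZFN and TALEN screens at FUT8.
SCREEN = {
    "ZFN": {"clones": 96, "IgG-expressing": 25, "complete insertion": 14},
    "TALEN": {"clones": 171, "IgG-expressing": 35, "complete insertion": 16},
}


def summarize(screen):
    """Return {nuclease: {category: percent of screened clones}}."""
    summary = {}
    for nuclease, counts in screen.items():
        total = counts["clones"]
        summary[nuclease] = {
            category: percent(n, total)
            for category, n in counts.items()
            if category != "clones"
        }
    return summary


summary = summarize(SCREEN)
```

Both percentages in each screen are expressed relative to all clones screened with that nuclease, which is the reading that reproduces the reported figures.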

[0226] These experiments were also performed using TALE-nucleases targeted to FUT8. Transgene integration at FUT8 was confirmed by Southern blot analysis for ten PCR- and IgG-positive clones derived from TALEN-mediated transgene insertion (Figures 4B and 4C). In addition, Figure 12 shows sequences obtained by sequencing of PCR junctions of FUT8-integrated donors using ZFNs (Figure 10A) or TALENs (Figure 10B).

[0227] In sum, the methods and compositions described herein provide for the facile and targeted integration of large transgenes via homology-independent methods, including in cell lines (e.g., CHO cells) that are resistant to homology-driven integration.

Example 6: Targeted integration into and disruption of wheat AHAS loci

Characterization and identification of AHAS genomic target sequences

[0228] The transcribed regions for three homoeologous AHAS genes were identified and determined, and zinc finger nucleases were designed to bind and cleave the sites for NHEJ-mediated targeting of a donor sequence as described in U.S. Provisional Patent Filing No. 61/809,097, incorporated herein by reference. These novel sequences are listed as SEQ ID NO:116, SEQ ID NO:117, and SEQ ID NO:118. Previous sequencing efforts identified and genetically mapped homoeologous copies of AHAS genes from Triticum aestivum to the long arms of chromosomes 6A, 6B and 6D (Anderson et al., (2004) Weed Science 52:83-90; and, Li et al., (2008) Molecular Breeding 22:217-225). Sequence analysis of Expressed Sequence Tags (EST) and genomic sequences available in Genbank (Accession Numbers: AY210405.1, AY210407.1, AY210406.1, AY210408.1, FJ997628.1, FJ997629.1, FJ997631.1, FJ997630.1, FJ997627.1, and AY273827.1) was used to determine the transcribed region for the homoeologous copies of the AHAS gene (SEQ ID NOs: 116-118).

[0229] The novel, non-coding AHAS gene sequences located upstream and downstream of the transcribed region were characterized for the first time. To completely characterize these non-coding sequences, the transcribed sequences for each of the three homoeologous copies of the AHAS gene were used as BLASTN™ queries to screen unassembled ROCHE 454™ sequence reads that had been generated from whole genome shotgun sequencing of Triticum aestivum cv. Chinese Spring. The ROCHE 454™ sequence reads of Triticum aestivum cv. Chinese Spring had been generated to 5-fold sequence coverage. Sequence assembly of the ROCHE 454™ reads was completed using the SEQUENCHER SOFTWARE™ (GeneCodes, Ann Arbor, MI). Sequence reads with a significant BLASTN™ hit (E-value < 0.0001) were used to characterize these non-transcribed regions. Iterative rounds of BLASTN™ analysis and sequence assembly were performed. Each iteration incorporated the assembled AHAS sequence from the previous iteration so that all of the sequences were compiled as a single contiguous sequence. Overall, 4,384, 7,590 and 6,205 bp of genomic sequence for the homoeologous AHAS genes located on chromosomes 6A, 6B and 6D, respectively, were characterized (SEQ ID NOs: 119-121).

Sequence analysis of AHAS genes isolated from Triticum aestivum cv. Bobwhite MPB26RH

[0230] The homoeologous copies of the AHAS gene were cloned and sequenced from Triticum aestivum cv. Bobwhite MPB26RH to obtain nucleotide sequence suitable for designing specific zinc finger proteins that could bind the sequences with a high degree of specificity. The sequence analysis of the AHAS nucleotide sequences obtained from Triticum aestivum cv. Bobwhite MPB26RH was required to confirm the annotation of nucleotides present in the Genbank and ROCHE 454™ AHAS gene sequences, and because of allelic variation between cv. Bobwhite MPB26RH and the other wheat varieties from which the Genbank and ROCHE 454™ sequences were obtained.

[0231] A cohort of PCR primers was designed for amplification of the AHAS genes (Table 1). The primers were designed from a consensus sequence which was produced from multiple sequence alignments generated using CLUSTALW™ (Thompson et al., (1994) Nucleic Acids Research 22:4673-80). The sequence alignments were assembled from the cv. Chinese Spring sequencing data generated from ROCHE 454™ sequencing which was completed at a 5-fold coverage.

[0232] As indicated in Table 1, the PCR primers were designed to amplify all three homoeologous sequences or to amplify only a single homoeologous sequence. For example, the PCR primers used to amplify the transcribed region of the AHAS gene were designed to simultaneously amplify all three homoeologous copies in a single multiplex PCR reaction. The PCR primers used to amplify the non-transcribed region were either designed to amplify all three homoeologous copies or to amplify only a single homoeologous copy. All of the PCR primers were designed to be between 18 and 27 nucleotides in length and to have a melting temperature of 60 to 65°C, optimal 63°C. In addition, several primers were designed to position the penultimate base (which contained a phosphorothioate linkage and is indicated in Table 1 as an asterisk [*]) over a nucleotide sequence variation that distinguished the gene copies from each wheat sub-genome. Table 1 lists the PCR primers that were designed and synthesized.
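The length and melting-temperature constraints stated above can be expressed as a simple screening function. The following is a minimal sketch only: the disclosure does not state which melting-temperature model was used, so the basic GC-content formula below is an assumption chosen for brevity, and it will generally give different values from the nearest-neighbour models used by primer design software. The function names are hypothetical.

```python
def basic_tm(seq: str) -> float:
    """Rough melting temperature (deg C) from length and G+C count.

    ASSUMPTION: uses the basic formula Tm = 64.9 + 41 * (GC - 16.4) / N.
    The source does not say which Tm model was applied to its primers.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    return 64.9 + 41.0 * (gc - 16.4) / len(seq)


def meets_design_rules(primer: str,
                       min_len: int = 18, max_len: int = 27,
                       tm_low: float = 60.0, tm_high: float = 65.0) -> bool:
    """Check the 18-27 nt length window and the 60-65 deg C Tm window."""
    # Remove the spacing and phosphorothioate marker (*) used in Table 1 notation.
    seq = primer.replace(" ", "").replace("*", "")
    if not (min_len <= len(seq) <= max_len):
        return False
    return tm_low <= basic_tm(seq) <= tm_high


if __name__ == "__main__":
    # Hypothetical sequences for illustration, not primers from Table 1.
    for candidate in ("GCGCGCGCGCGCGCGATATA", "ATATATATATATATATAT"):
        print(candidate, meets_design_rules(candidate))
```

Degenerate bases such as R and Y are not counted toward G+C by this sketch, which is a further simplification.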

Table 1: Primer sequences used for PCR amplification of AHAS sequences

Primer name             Region   Genome amplified   SEQ ID NO:   Sequence (5'→3')
AHAS_1R1_transcribed    Coding   A, B, and D        129          GGG TCG TCR CTG GGG AAG TT
AHAS_2F2_transcribed    Coding   A, B, and D        130          GCC TTC TTC CTY GCR TCC TCT GG
AHAS_2R2_transcribed    Coding   A, B, and D        131          GCC CGR TTG GCC TTG TAA AAC CT
AHAS_3F1_transcribed    Coding   A, B, and D        132          AYC AGA TGT GGG CGG CTC AGT AT
AHAS_3R1_transcribed    Coding   A, B, and D        133          GGG ATA TGT AGG ACA AGA AAC TTG CAT GA
AHAS-6A.PS.3'.F1        3'UTR    A                  134          AGGGCCATACTTGTTGGATATCAT*C
AHAS-6A.PS.3'.R2        3'UTR    A                  135          GCCAACACCCTACACTGCCTA*T
AHAS-6B.PS.3'.F1        3'UTR    B                  136          TGCGCAATCAGCATGATACC*T
AHAS-6B.PS.3'.R1        3'UTR    B                  137          ACGTATCCGCAGTCGAGCAA*T
AHAS-6D.PS.3'.F1        3'UTR    D                  138          GTAGGGATGTGCTGTCATAAGAT*G
AHAS-6D.PS.3'.R3        3'UTR    D                  139          TTGGAGGCTCAGCCGATCA*C

UTR = untranslated region

Coding = primers designed for the transcribed regions

asterisk (*) indicates the incorporation of a phosphorothioate linkage

[0233] Sub-genome-specific amplification was achieved using on-off PCR (Yang et al., (2005) Biochemical and Biophysical Research Communications 328:265-72) with primers that were designed to position the penultimate base (which contained a phosphorothioate linkage) over a nucleotide sequence variation that distinguished the gene copies from each wheat sub-genome. Two different sets of PCR conditions were used to amplify the homoeologous copies of the AHAS gene from cv. Bobwhite MPB26RH. For the transcribed regions, the PCR reaction contained 0.2 mM dNTPs, 1X IMMOLASE PCR™ buffer (Bioline, Taunton, MA), 1.5 mM MgCl₂, 0.25 units IMMOLASE DNA POLYMERASE™ (Bioline, Taunton, MA), 0.2 μM each of forward and reverse primer, and about 50 ng genomic DNA. Reactions containing the AHAS_1F1 and AHAS_1R1 primers were supplemented with 8% (v/v) DMSO. For the non-transcribed regions, the PCR reactions contained 0.2 mM dNTPs, 1X PHUSION GC BUFFER™ (New England Biolabs, Ipswich, MA), 0.5 units HOT-START PHUSION DNA™ polymerase (New England Biolabs), 0.2 μM each of forward and reverse primer, and about 50 ng genomic DNA. PCR was performed in a final 25 μl reaction volume using an MJ PTC200® thermocycler (BioRad, Hercules, CA). Following PCR cycling, the reaction products were purified and cloned using PGEM-T EASY VECTOR™ (Promega, Madison, WI) into E. coli JM109 cells. Plasmid DNA was extracted using a DNAEASY PLASMID DNA PURIFICATION KIT™ (Qiagen, Valencia, CA) and Sanger sequenced using BIGDYE® v3.1 chemistry (Applied Biosystems, Carlsbad, CA) on an ABI3730XL® automated capillary electrophoresis platform. Sequence analysis performed using SEQUENCHER SOFTWARE™ (GeneCodes, Ann Arbor, MI) was used to generate a consensus sequence for each homoeologous gene copy (SEQ ID NO:140, SEQ ID NO:141, and SEQ ID NO:142) from cv. Bobwhite MPB26RH. CLUSTALW™ was used to produce a multiple consensus sequence alignment from which homoeologous sequence variation distinguishing between the AHAS gene copies was confirmed.

Design of Zinc Finger Binding domains specific to AHAS gene sequences

[0234] Zinc finger proteins directed against the identified DNA sequences of the homoeologous copies of the AHAS genes were designed as previously described. See, e.g., Urnov et al., (2005) Nature 435:646-651. Exemplary target sequences and recognition helices are shown in Table 2 (recognition helix region designs) and Table 3 (target sites). In Table 3, nucleotides in the target site that are contacted by the ZFP recognition helices are indicated in uppercase letters; non-contacted nucleotides are indicated in lowercase. Zinc Finger Nuclease (ZFN) target sites were designed for 4 regions in the AHAS gene: a region about 500-bp upstream of the serine 653 amino acid residue, an upstream region adjacent (within 30-bp) to the serine 653 amino acid residue, a downstream region adjacent (within 80-bp) to the serine 653 amino acid residue, and a region about 400-bp downstream of the serine 653 amino acid residue. Numerous ZFP designs were developed and tested to identify the fingers which bound with the highest level of efficiency to 22 different AHAS target sites which were identified in wheat as described in U.S. Provisional Patent Filing No. 61/809,097, incorporated herein by reference. The specific ZFP recognition helices (Table 2) which bound with the highest level of efficiency to the zinc finger recognition sequences were used for NHEJ-mediated targeting and integration of a donor sequence (homology-independent targeted integration) within the AHAS locus of the wheat genome.

Table 2: AHAS zinc finger designs (N/A indicates "not applicable")

Table 3: Target site of AHAS zinc fingers

[0235] The AHAS zinc finger designs were incorporated into zinc finger expression vectors encoding a protein having at least one finger with a CCHC structure. See, U.S. Patent Publication No. 2008/0182332. In particular, the last finger in each protein had a CCHC backbone for the recognition helix. The non-canonical zinc finger-encoding sequences were fused to the nuclease domain of the type IIS restriction enzyme FokI (amino acids 384-579 of the sequence of Wah et al., (1998) Proc. Natl. Acad. Sci. USA 95:10564-10569) via a four amino acid ZC linker and an opaque-2 nuclear localization signal derived from Zea mays to form AHAS zinc-finger nucleases (ZFNs). See, U.S. Patent No. 7,888,121.

[0236] The optimal zinc fingers were verified for cleavage activity using a budding yeast based system previously shown to identify active nucleases. See, e.g., U.S. Patent Publication No. 2009/0111119; Doyon et al., (2008) Nat Biotechnology 26:702-708; Geurts et al., (2009) Science 325:433. Zinc fingers for the various functional domains were selected for in vivo use. Of the numerous ZFNs that were designed, produced and tested to bind to the putative AHAS genomic polynucleotide target sites, the ZFNs described in Table 2 above were identified as having in vivo activity at high levels, and were characterized as being capable of efficiently binding and cleaving the unique AHAS genomic polynucleotide target sites in planta.

Evaluation of zinc finger nuclease cleavage of AHAS genes using transient assays

[0237] ZFN construct assembly: Plasmid vectors containing ZFN gene expression constructs, which were identified using the yeast assay as previously described, were designed and completed using skills and techniques commonly known in the art (see, for example, Ausubel or Maniatis). Each ZFN-encoding sequence was fused to a sequence encoding an opaque-2 nuclear localization signal (Maddaloni et al., (1989) Nuc. Acids Res. 17:7532) that was positioned upstream of the zinc finger nuclease.

[0238] Expression of the fusion proteins was driven by the strong constitutive promoter from the Zea mays Ubiquitin gene, which includes the 5' untranslated region (UTR) (Toki et al., (1992) Plant Physiology 100:1503-07). The expression cassette also included the 3' UTR (comprising the transcriptional terminator and polyadenylation site) from the Zea mays peroxidase 5 (Per5) gene (US Patent Publication No. 2004/0158887). The nucleotide sequence encoding the self-hydrolyzing 2A peptide from Thosea asigna virus (Szymczak et al., (2004) Nat Biotechnol. 22:760-760) was added between the two Zinc Finger Nuclease fusion proteins that were cloned into the construct.

[0239] The plasmid vectors were assembled using the IN-FUSION™ Advantage Technology (Clontech, Mountain View, CA). Restriction endonucleases were obtained from New England BioLabs (Ipswich, MA) and T4 DNA Ligase (Invitrogen, Carlsbad, CA) was used for DNA ligation. Plasmid preparations were performed using NUCLEOSPIN® Plasmid Kit (Macherey-Nagel Inc., Bethlehem, PA) or the Plasmid Midi Kit (Qiagen) following the instructions of the suppliers. DNA fragments were isolated using QIAQUICK GEL EXTRACTION KIT™ (Qiagen) after agarose tris-acetate gel electrophoresis. Colonies of all ligation reactions were initially screened by restriction digestion of miniprep DNA. Plasmid DNA of selected clones was sequenced by a commercial sequencing vendor (Eurofins MWG Operon, Huntsville, AL). Sequence data were assembled and analyzed using the SEQUENCHER™ software (Gene Codes Corp., Ann Arbor, MI).

[0240] Representative plasmids pDAB109350 and pDAB109360 are shown in Figure 11 and Figure 12 and were confirmed via restriction enzyme digestion and via DNA sequencing.

Preparation of ZFN constructs DNA for transfection

[0241] Before delivery to Triticum aestivum protoplasts, plasmid DNA for each ZFN construct was prepared from cultures of E. coli using the PURE YIELD PLASMID MAXIPREP SYSTEM® (Promega Corporation, Madison, WI) or PLASMID MAXI KIT® (Qiagen, Valencia, CA) following the instructions of the suppliers.

Isolation of wheat mesophyll protoplasts

[0242] Mesophyll protoplasts from the wheat line cv. Bobwhite MPB26RH were prepared for transfection using polyethylene glycol (PEG)-mediated DNA delivery as follows.

[0243] Mature seed was surface sterilized by immersing in 80% (v/v) ethanol for 30 sec, rinsing twice with tap water, followed by washing in 20% DOMESTOS® (0.8% v/v available chlorine) on a gyratory shaker at 140 rpm for 20 min. The DOMESTOS® was removed by decanting and the seeds were rinsed four times with sterile water. Excess water was removed by placing the seed on WHATMAN™ filter paper. The seeds were placed in a sterile PETRI™ dish on several sheets of dampened sterile WHATMAN™ filter paper and incubated for 24 h at 24°C. Following incubation, the seeds were surface sterilized a second time in 15% DOMESTOS® with 15 min shaking, followed by rinsing with sterile water as described previously. The seeds were placed on Murashige and Skoog (MS) solidified media for 24 h at 24°C. Finally, the seeds were surface sterilized a third time in 10% DOMESTOS® with 10 min shaking, followed by rinsing in sterile water as previously described. The seeds were placed, crease side down, onto MS solidified media with 10 seeds per PETRI™ dish and germinated in the dark at 24°C for 14-21 days.

[0244] About 2-3 grams of leaf material from the germinated seeds was cut into 2-3 cm lengths and placed in a pre-weighed PETRI™ dish. Leaf sheath and yellowing leaf material was discarded. Approximately 10 mL of leaf enzyme digest mix (0.6 M mannitol, 10 mM MES, 1.5% w/v cellulase R10, 0.3% w/v macerozyme, 1 mM CaCl₂, 0.1% bovine serum albumin, 0.025% v/v pluronic acid, 5 mM β-mercaptoethanol, pH 5.7) was pipetted into the PETRI™ dish and the leaf material was chopped transversely into 1-2 mm segments using a sharp scalpel blade. The leaf material was chopped in the presence of the leaf digest mix to prevent cell damage resulting from the leaf material drying out. Additional leaf enzyme digest mix was added to the PETRI™ dish to a volume of 10 mL per gram fresh weight of leaf material and subjected to vacuum (20" Hg) pressure for 30 min. The PETRI™ dish was sealed with PARAFILM® and incubated at 28°C with gentle rotational shaking for 4-5 hours.

[0245] Mesophyll protoplasts released from the leaf segments into the enzyme digest mix were isolated from the plant debris by passing the digestion suspension through a 100 micron mesh and into a 50 mL collection tube. To maximize the yield of protoplasts, the digested leaf material was washed three times. Each wash was performed by adding 10 mL wash buffer (20 mM KCl, 4 mM MES, 0.6 M mannitol, pH 5.6) to the PETRI™ dish, swirling gently for 1 min, followed by passing of the wash buffer through the 100 micron sieve into the same 50 mL collection tube. Next, the filtered protoplast suspension was passed through a 70 micron sieve, followed by a 40 micron sieve. Next, 6 mL aliquots of the filtered protoplast suspension were transferred to 12 mL round bottomed centrifugation tubes with lids and centrifuged at 70 g and 12°C for 10 min. Following centrifugation, the supernatant was removed and the protoplast pellets were each resuspended in 7 mL wash buffer. The protoplasts were pelleted a second time by centrifugation, as described above. The protoplasts were each resuspended in 1 mL wash buffer and pooled to two centrifugation tubes. The wash buffer volume was adjusted to a final volume of 7 mL in each tube before centrifugation was performed, as described above. Following removal of the supernatant, the protoplast pellets were resuspended in 1 mL wash buffer and pooled to a single tube. The yield of mesophyll protoplasts was estimated using a Neubauer haemocytometer. Evans Blue stain was used to determine the proportion of live cells recovered.

PEG-mediated transfection of mesophyll protoplasts

[0246] About 10⁶ mesophyll protoplasts were added to a 12 mL round bottomed tube and pelleted by centrifugation at 70 g before removing the supernatant. The protoplasts were gently resuspended in 600 μl wash buffer containing 70 μg of plasmid DNA. The plasmid DNA consisted of the Zinc Finger Nuclease constructs described above. Next, an equal volume of 40% PEG solution (40% w/v PEG 4,000, 0.8 M mannitol, 1 M Ca(NO₃)₂, pH 5.6) was slowly added to the protoplast suspension with simultaneous mixing by gentle rotation of the tube. The protoplast suspension was allowed to incubate for 15 min at room temperature without any agitation.

[0247] An additional 6 mL volume of wash buffer was slowly added to the protoplast suspension in sequential aliquots of 1 mL, 2 mL and 3 mL. Simultaneous gentle mixing was used to maintain a homogenous suspension with each sequential aliquot. Half of the protoplast suspension was transferred to a second 12 mL round bottomed tube and an additional 3 mL volume of wash buffer was slowly added to each tube with simultaneous gentle mixing. The protoplasts were pelleted by centrifugation at 70 g for 10 min and the supernatant was removed. The protoplast pellets were each resuspended in 1 mL wash buffer before protoplasts from the paired round bottomed tubes were pooled to a single 12 mL tube. An additional 7 mL wash buffer was added to the pooled protoplasts before centrifugation as described above. The supernatant was completely removed and the protoplast pellet was resuspended in 2 mL Qiao's media (0.44% w/v MS plus vitamins, 3 mM MES, 0.0001% w/v 2,4-D, 0.6 M glucose, pH 5.7). The protoplast suspension was transferred to a sterile 3 cm PETRI™ dish and incubated in the dark at 24°C for 72 h.

Genomic DNA isolation from mesophyll protoplasts

[0248] Transfected protoplasts were transferred from the 3 cm PETRI™ dish to a 2 mL microfuge tube. The cells were pelleted by centrifugation at 70 g and the supernatant was removed. To maximize the recovery of transfected protoplasts, the PETRI™ dish was rinsed three times with 1 mL of wash buffer. Each rinse was performed by swirling the wash buffer in the PETRI™ dish for 1 min, followed by transfer of the liquid to the same 2 mL microfuge tube. At the end of each rinse, the cells were pelleted by centrifugation at 70 g and the supernatant was removed. The pelleted protoplasts were snap frozen in liquid nitrogen before freeze drying for 24 h in a LABCONCO FREEZONE 4.5® (Labconco, Kansas City, MO) at -40°C and 133 x 10⁻³ mBar pressure. The lyophilized cells were subjected to DNA extraction using the DNEASY® PLANT DNA EXTRACTION MINI kit (Qiagen) following the manufacturer's instructions, with the exception that tissue disruption was not required and the protoplast cells were added directly to the lysis buffer.

PCR assay of protoplast genomic DNA for ZFN sequence cleavage

[0249] To enable the cleavage efficacy and target site specificity of ZFNs designed for the AHAS gene locus to be investigated, PCR primers were designed to amplify up to a 300-bp fragment within which one or more ZFN target sites were captured. One of the primers was designed to be within a 100-bp window of the captured ZFN target site(s). This design strategy enabled Illumina short read technology to be used to assess the integrity of the target ZFN site in the transfected protoplasts. In addition, the PCR primers were designed to amplify the three homoeologous copies of the AHAS gene and to capture nucleotide sequence variation that differentiated between the homoeologs such that the Illumina sequence reads could be unequivocally attributed to the wheat sub-genome from which they were derived.

[0250] A total of four sets of PCR primers were designed to amplify the ZFN target site loci (Table 4). Each primer set was synthesized with the Illumina SP1 and SP2 sequences at the 5' end of the forward and reverse primer, respectively, to provide compatibility with Illumina short read sequencing chemistry. The synthesized primers also contained a phosphorothioate linkage at the penultimate 5' and 3' nucleotides (indicated in Table 4 as an asterisk [*]). The 5' phosphorothioate linkage afforded protection against exonuclease degradation of the Illumina SP1 and SP2 sequences, while the 3' phosphorothioate linkage improved PCR specificity for amplification of the target AHAS sequences using on-off PCR (Yang et al., (2005)). All PCR primers were designed to be between 18 and 27 nucleotides in length and to have a melting temperature of 60 to 65°C, optimal 63°C.

[0251] In Table 4, nucleotides specific for the AHAS gene are indicated in uppercase type; nucleotides corresponding to the Illumina SP1 and SP2 sequences are indicated in lowercase type. Each primer set was empirically tested for amplification of the three homoeologous AHAS gene copies through Sanger-based sequencing of the PCR amplification products.

Table 4: Primer sequences used to assess AHAS ZFN cleavage efficacy and target site specificity

asterisk (*) is used to indicate a phosphorothioate

[0252] PCR amplification of ZFN target site loci from the genomic DNA extracted from transfected wheat mesophyll protoplasts was used to generate the requisite loci specific DNA molecules in the correct format for Illumina-based sequencing-by-synthesis technology. Each PCR assay was optimized to work on 200 ng starting DNA (about 12,500 cell equivalents of the Triticum aestivum genome). Multiple reactions were performed per transfected sample to ensure sufficient copies of the Triticum aestivum genome were assayed for reliable assessment of ZFN efficiency and target site specificity. About sixteen PCR assays, equivalent to 200,000 copies of the Triticum aestivum genome taken from individual protoplasts, were performed per transfected sample. A single PCR master-mix was prepared for each transfected sample. To ensure optimal PCR amplification of the ZFN target site (i.e. to prevent PCR reagents from becoming limiting and to ensure that PCR remained in the exponential amplification stage) an initial assay was performed using a quantitative PCR method to determine the optimal number of cycles to perform on the target tissue. The initial PCR was performed with the necessary negative control reactions on a MX3000P THERMOCYCLER™ (Stratagene). From the data output gathered from the quantitative PCR instrument, the relative increase in fluorescence was plotted from cycle-to-cycle and the cycle number was determined per assay that would deliver sufficient amplification, while not allowing the reaction to become reagent limited, in an attempt to reduce over-cycling and biased amplification of common molecules. The unused master mix remained on ice until the quantitative PCR analysis was concluded and the optimal cycle number determined. The remaining master mix was then aliquoted into the desired number of reaction tubes (about 16 per ZFN assay) and PCR amplification was performed for the optimal cycle number. 
Following amplification, samples for the same ZFN target site were pooled together and 200 μl of pooled product per ZFN was purified using a QIAQUICK MINIELUTE PCR PURIFICATION KIT™ (Qiagen) following the manufacturer's instructions.
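The sampling arithmetic in this paragraph can be checked with a short calculation. The sketch below uses only the figures stated above (200 ng per reaction, about 12,500 cell equivalents per reaction, about sixteen reactions per sample); the function names are hypothetical, and the per-genome mass it prints is merely the value implied by those figures, not a value stated in the disclosure.

```python
DNA_PER_REACTION_NG = 200.0             # starting DNA per PCR assay (from the text)
CELL_EQUIVALENTS_PER_REACTION = 12_500  # genome copies per assay (from the text)
REACTIONS_PER_SAMPLE = 16               # "about sixteen" assays per sample


def implied_mass_per_genome_pg(dna_ng: float, cell_equivalents: int) -> float:
    """DNA mass per genome copy (pg) implied by the two stated figures."""
    return dna_ng * 1000.0 / cell_equivalents


def total_genome_copies(reactions: int, copies_per_reaction: int) -> int:
    """Total genome copies sampled across all reactions for one sample."""
    return reactions * copies_per_reaction


if __name__ == "__main__":
    print(implied_mass_per_genome_pg(DNA_PER_REACTION_NG,
                                     CELL_EQUIVALENTS_PER_REACTION), "pg per genome copy")
    print(total_genome_copies(REACTIONS_PER_SAMPLE,
                              CELL_EQUIVALENTS_PER_REACTION), "genome copies per sample")
```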

[0253] To enable the sample to be sequenced using Illumina short read technology, an additional round of PCR was performed to introduce the Illumina P5 and P7 sequences onto the amplified DNA fragments, as well as a sequence barcode index that could be used to unequivocally attribute sequence reads to the sample from which they originated. This was achieved using primers that were in part complementary to the SP1 and SP2 sequences added in the first round of amplification, but also contained the sample index and P5 and P7 sequences. The optimal number of PCR cycles required to add the additional sequences to the template without over-amplifying common fragments was determined by quantitative PCR cycle analysis, as described above. Following amplification, the generated product was purified using AMPURE MAGNETIC BEADS® (Beckman-Coulter) with a DNA-to-bead ratio of 1:1.7. The purified DNA fragments were titrated for sequencing by Illumina short read technology using a PCR-based library quantification kit (KAPA) according to the manufacturer's instructions. The samples were prepared for sequencing using a cBot cluster generation kit (Illumina) and were sequenced on an ILLUMINA GAII™ or HISEQ2000™ instrument (Illumina) to generate 100-bp paired end sequence reads, according to the manufacturer's instructions.

Data analysis for detecting NHEJ at target ZFN sites

[0254] Following generation of Illumina short read sequence data for sample libraries prepared for transfected mesophyll protoplasts, bioinformatics analysis was performed to identify deleted nucleotides at the target ZFN sites. Such deletions are known to be indicators of in planta ZFN activity that result from non-homologous end joining (NHEJ) DNA repair.

[0255] To identify sequence reads with NHEJ deletions, the manufacturer's supplied scripts for processing sequence data generated on the HISEQ2000™ instrument (Illumina) were used to first computationally assign the short sequence reads to the protoplast sample from which they originated. Sample assignment was based on the barcode index sequence that was introduced during library preparation, as described previously. Correct sample assignment was assured as the 6-bp barcode indexes used to prepare the libraries were differentiated from each other by at least a two-step sequence difference.
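The barcode property described above can be illustrated as follows. This is a sketch under stated assumptions: "two-step sequence difference" is interpreted here as a Hamming distance of at least two between any pair of barcodes, the barcode sequences and sample names are hypothetical, and the exact-match assignment shown is not the manufacturer's demultiplexing script.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length barcodes differ."""
    if len(a) != len(b):
        raise ValueError("barcodes must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(barcodes):
    """Smallest Hamming distance between any two barcodes in the set."""
    return min(hamming(a, b) for a, b in combinations(barcodes, 2))


def assign_sample(index_read: str, barcode_to_sample: dict):
    """Exact-match assignment; returns None for an unrecognised index read.

    With a minimum pairwise distance of 2, a single sequencing error can
    never turn one valid barcode into another, so a read is either
    assigned correctly or left unassigned.
    """
    return barcode_to_sample.get(index_read)


if __name__ == "__main__":
    # Hypothetical 6-bp barcodes, for illustration only.
    samples = {"ATCACG": "sample_A", "CGATGT": "sample_B", "TTAGGC": "sample_C"}
    print(min_pairwise_distance(list(samples)))
    print(assign_sample("CGATGT", samples), assign_sample("CGATGA", samples))
```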

[0256] Following sample assignment, a quality filter was passed across all sequences. The quality filter was implemented in a custom-developed PERL script. Sequence reads were excluded if there were more than three ambiguous bases, or if the median Phred score was less than 20, or if there were three or more consecutive bases with a Phred score less than 20, or if the sequence read was shorter than 40 nucleotides in length.
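The four exclusion criteria can be sketched as a single predicate. The original filter was a custom PERL script that is not reproduced in this disclosure; the Python function below is an independent illustration, and it assumes that an "ambiguous base" is any character other than A, C, G or T (for example N). The function and parameter names are hypothetical.

```python
from statistics import median


def passes_quality_filter(seq: str, quals: list,
                          max_ambiguous: int = 3,
                          min_median_q: int = 20,
                          low_q: int = 20,
                          low_run_limit: int = 3,
                          min_length: int = 40) -> bool:
    """Return True if a read survives all four exclusion criteria."""
    if len(seq) != len(quals):
        raise ValueError("sequence and quality lists must have equal length")
    # Criterion 4: read shorter than 40 nucleotides.
    if len(seq) < min_length:
        return False
    # Criterion 1: more than three ambiguous bases (ASSUMPTION: non-ACGT).
    if sum(1 for base in seq.upper() if base not in "ACGT") > max_ambiguous:
        return False
    # Criterion 2: median Phred score below 20.
    if median(quals) < min_median_q:
        return False
    # Criterion 3: three or more consecutive bases with Phred below 20.
    run = 0
    for q in quals:
        run = run + 1 if q < low_q else 0
        if run >= low_run_limit:
            return False
    return True
```

For example, a 40-nucleotide read with uniform Phred 30 passes, whereas the same read with three consecutive Phred 10 positions is excluded.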

[0257] Next, the quality trimmed sequences were attributed to the wheat sub-genome from which they originated. This was achieved using a second custom developed PERL script in which sub-genome assignment was determined from the haplotype of the nucleotide sequence variants that were captured by the PCR primers used to amplify the three homoeologous copies of the AHAS gene, as described above.
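Haplotype-based sub-genome assignment can be sketched as an intersection over diagnostic variant positions. The positions and alleles below are hypothetical placeholders, not the variants captured by the Table 4 primers, and the sketch reads bases at fixed offsets; a read carrying a deletion upstream of a diagnostic site would need to be aligned first, which is not shown here.

```python
# Hypothetical diagnostic sites: 0-based read position -> allele per sub-genome.
DIAGNOSTIC_SITES = {
    12: {"A": "G", "B": "A", "D": "G"},
    27: {"A": "C", "B": "C", "D": "T"},
}


def assign_subgenome(read: str, sites: dict = DIAGNOSTIC_SITES):
    """Return 'A', 'B' or 'D' if the read's haplotype matches exactly one
    sub-genome at every diagnostic site; otherwise return None."""
    candidates = None
    for pos, alleles in sites.items():
        if pos >= len(read):
            return None  # read does not cover this diagnostic site
        matching = {sg for sg, allele in alleles.items() if allele == read[pos]}
        candidates = matching if candidates is None else candidates & matching
    if candidates is not None and len(candidates) == 1:
        return next(iter(candidates))
    return None


if __name__ == "__main__":
    read_a = "A" * 12 + "G" + "A" * 14 + "C" + "A" * 5   # G at 12, C at 27
    read_d = "A" * 12 + "G" + "A" * 14 + "T" + "A" * 5   # G at 12, T at 27
    print(assign_subgenome(read_a), assign_subgenome(read_d))
```

Reads whose haplotype is inconsistent with every sub-genome, or consistent with more than one, are left unassigned rather than guessed.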

[0258] Finally, the frequency of NHEJ deletions at the ZFN cleavage site in the sub-genome-assigned sequence reads was determined for each sample using a third custom developed PERL script and manual data manipulation in Microsoft Excel 2010 (Microsoft Corporation). This was achieved by counting the frequency of unique NHEJ deletions on each sub-genome within each sample.

[0259] Two approaches were used to assess the cleavage efficiency and specificity of the ZFNs tested. Cleavage efficiency was expressed (in parts per million reads) as the proportion of sub-genome assigned sequences that contained a NHEJ deletion at the ZFN target site. Rank ordering of the ZFNs by their observed cleavage efficiency was used to identify ZFNs with the best cleavage activity for each of the four target regions of the AHAS genes in a sub-genome-specific manner.
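Cleavage efficiency in parts per million and the resulting rank order can be computed as sketched below. The read counts and ZFN labels are invented solely to exercise the functions and are not results from this disclosure; the function names are hypothetical.

```python
def cleavage_efficiency_ppm(reads_with_deletion: int, total_reads: int) -> float:
    """Proportion of sub-genome-assigned reads carrying an NHEJ deletion,
    expressed in parts per million reads."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return 1_000_000 * reads_with_deletion / total_reads


def rank_zfns(counts: dict) -> list:
    """Rank ZFNs for one sub-genome, highest efficiency first.

    counts maps a ZFN label to (reads_with_deletion, total_assigned_reads).
    """
    efficiencies = {name: cleavage_efficiency_ppm(deleted, total)
                    for name, (deleted, total) in counts.items()}
    return sorted(efficiencies.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Illustrative counts only.
    example = {"zfn_x": (25, 100_000), "zfn_y": (120, 200_000)}
    for name, ppm in rank_zfns(example):
        print(name, ppm)
```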

[0260] All of the ZFNs tested showed NHEJ deletion size distributions consistent with that expected for in planta ZFN activity. Cleavage specificity was expressed as the ratio of cleavage efficiencies observed across the three sub-genomes. The inclusion of biological replicates in the data analyses did not substantially affect the rank order for cleavage activity and specificity of the ZFNs tested.

[0261] From these results, the ZFNs encoded on plasmid pDAB109350 (i.e. ZFN 29732 and 29730) and pDAB109360 (i.e. ZFN 30012 and 30018) were selected for in planta targeting in subsequent experiments, given their characteristics of significant genomic DNA cleavage activity in each of the three wheat sub-genomes.

Evaluation of donor designs for ZFN-mediated AHAS gene editing using transient assays

[0262] To investigate ZFN-mediated genomic editing at the endogenous AHAS gene locus in wheat, a series of experiments were undertaken to assess the effect of donor design on the efficiency of non-homologous end joining (NHEJ)-directed DNA repair. These experiments used transient assays to monitor the efficiency for ZFN-mediated addition of the previously described S653N mutation conferring tolerance to imidazolinone class herbicides (Li et al., (2008) Molecular Breeding 22:217-225) at the endogenous AHAS gene locus in wheat, or alternatively for ZFN-mediated introduction of an EcoRI restriction endonuclease sequence site at the double strand DNA break created in the endogenous AHAS genes by targeted ZFN cleavage.

Donor designs for NHEJ-directed DNA repair

[0263] Two types of donor DNA designs were used for NHEJ-directed DNA repair.

[0264] The first type of donor design was a linear, double stranded DNA molecule comprising 41-bp of sequence that shared no homology with the endogenous AHAS genes in wheat. Two donor DNA molecules were designed, each to target the three homoeologous copies of the AHAS gene. Both donor DNA molecules had protruding 5' and 3' ends to provide ligation overhangs to facilitate ZFN-mediated NHEJ-directed DNA repair. The two donor DNA molecules differed by the sequence at their protruding 3' end. The first donor DNA molecule, pDAS000152 (SEQ ID NO:175 and SEQ ID NO:176), was designed to provide ligation overhangs that were compatible with those generated by cleavage of the endogenous AHAS genes by ZFNs 29732 and 29730 (encoded on plasmid pDAB109350) and to result in the insertion of the 41-bp donor molecule into the endogenous AHAS gene at the site of the double strand DNA break via NHEJ-directed DNA repair. The second donor DNA molecule, pDAS000149 (SEQ ID NO:177 and SEQ ID NO:178), was designed to provide ligation overhangs that were compatible with those generated by the dual cleavage of the endogenous AHAS genes by ZFNs 29732 and 29730 (encoded on plasmid pDAB109350) and ZFNs 30012 and 30018 (encoded on plasmid pDAB109360) and to result in the replacement of the endogenous AHAS sequence contained between the two double strand DNA breaks created by the ZFNs with the 41-bp donor molecule via NHEJ-directed DNA repair.

[0265] The second type of donor was a plasmid DNA vector containing 41-bp of sequence that shared no homology with the endogenous AHAS genes in wheat and that was flanked on either side by sequence that was recognized by the ZFN(s) used to create double strand DNA breaks in the endogenous AHAS genes. This donor design allowed in planta release of the unique 41-bp sequence from the plasmid DNA molecule by the same ZFN(s) used to cleave target sites in the endogenous AHAS genes, and simultaneous generation of protruding ends that were suitable for overhang ligation of the released 41-bp sequence into the endogenous AHAS genes via NHEJ-directed DNA repair. Two plasmid donor DNA molecules were designed, each to target the three homoeologous copies of the AHAS gene. The first plasmid donor molecule, pDAS000153 (SEQ ID NO:179 and SEQ ID NO:180) (Figure 13), was designed to provide ligation overhangs on the released 41-bp DNA fragment that were compatible with those generated by cleavage of the endogenous AHAS genes by ZFNs 29732 and 29730 (encoded on plasmid pDAB109350). The second plasmid donor molecule, pDAS000150 (SEQ ID NO:181 and SEQ ID NO:182) (Figure 14), was designed to provide ligation overhangs on the released 41-bp DNA fragment that were at one end compatible with those generated by ZFNs 29732 and 29730 (encoded on plasmid pDAB109350) and at the other end compatible with those generated by ZFNs 30012 and 30018 (encoded on plasmid pDAB109360). This design allowed the replacement of the endogenous AHAS sequence contained between the two double strand DNA breaks created by ZFNs 29732 and 29730 and ZFNs 30012 and 30018 with the 41-bp donor molecule sequence.

Synthesis of donor DNA for NHEJ-directed DNA repair

[0266] Standard cloning methods commonly known by one skilled in the art were used to build the plasmid vectors. Before delivery to Triticum aestivum, plasmid DNA for each donor construct was prepared from cultures of E. coli using the PURE YIELD PLASMID MAXIPREP SYSTEM® (Promega Corporation, Madison, WI) or PLASMID MAXI KIT® (Qiagen, Valencia, CA) following the instructions of the suppliers.

[0267] Standard phosphoramidite chemistry was used to synthesize the double stranded DNA donor molecules (Integrated DNA Technologies, Coralville, IA). For each donor molecule, a pair of complementary single stranded DNA oligomers was synthesized, each with two phosphorothioate linkages at its 5' end to provide protection against in planta endonuclease degradation. The single stranded DNA oligomers were purified by high performance liquid chromatography to enrich for full-length molecules and purified of chemical carryover from the synthesis steps using Na⁺ exchange. The double stranded donor molecule was formed by annealing equimolar amounts of the two complementary single-stranded DNA oligomers using standard methods commonly known by one skilled in the art. Before delivery to Triticum aestivum, the double stranded DNA molecules were diluted to the required concentration in sterile water.

Isolation of wheat protoplasts derived from somatic embryogenic callus

[0268] Protoplasts derived from somatic embryogenic callus (SEC) from the donor wheat line cv. Bobwhite MPB26RH were prepared for transfection using polyethylene glycol (PEG)-mediated DNA delivery as follows:

[0269] Seedlings of the donor wheat line were grown in an environment-controlled growth room maintained at 18/16°C (day/night) and a 16/8 hour (day/night) photoperiod with lighting provided at 800 mmol m per sec. Wheat spikes were collected at 12-14 days post-anthesis and were surface sterilized by soaking for 1 min in 70% (v/v) ethanol. The spikes were threshed and the immature seeds were sterilized for 15 min in 17% (v/v) bleach with gentle shaking, followed by rinsing at least three times with sterile distilled water. The embryos were aseptically isolated from the immature seeds under a dissecting microscope. The embryonic axis was removed using a sharp scalpel and discarded. The scutella were placed into a 9 cm PETRI™ dish containing 2-4 medium without TIMENTIN™, with the uncut scutellum oriented upwards. A total of 25 scutella were plated onto each 9 cm PETRI™ dish. Somatic embryogenic callus (SEC) formation was initiated by incubating in the dark at 24°C for 3 weeks. After 3 weeks, SEC was separated from non-embryogenic callus, placed onto fresh 2-4 medium without TIMENTIN™ and incubated for a further 3 weeks in the dark at 24°C. The SEC was sub-cultured a total of three times before being used for protoplast preparation.

[0270] About one gram of SEC was chopped into 1-2 mm pieces using a sharp scalpel blade in a 10 cm PETRI™ dish containing approximately 10 mL of wheat callus digest mix (2.5% w/v Cellulase RS, 0.2% w/v pectolyase Y23, 0.1% w/v DRISELASE®, 14 mM CaCl₂, 0.8 mM MgSO₄, 0.7 mM KH₂PO₄, 0.6 M Mannitol, pH 5.8) to prevent the callus from dehydrating. Additional callus digest mix was added to the PETRI™ dish to a volume of 10 mL per gram fresh weight of callus and the dish was subjected to vacuum (20" Hg) for 30 min. The PETRI™ dish was sealed with PARAFILM® and incubated at 28°C with gentle rotational shaking at 30-40 rpm for 4-5 hours.
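By way of illustration only, the enzyme quantities implied by the w/v percentages above can be worked out as in the following Python sketch. The function and variable names are hypothetical and form no part of the disclosed method; the only assumptions are that 1% w/v corresponds to 10 mg per mL and that the digest volume scales at 10 mL per gram fresh weight as stated.

```python
# Enzyme components of the wheat callus digest mix, as % w/v (from the text).
DIGEST_MIX_PERCENT_WV = {
    "Cellulase RS": 2.5,
    "pectolyase Y23": 0.2,
    "DRISELASE": 0.1,
}


def digest_volume_ml(callus_fresh_weight_g, ml_per_gram=10.0):
    """Digest mix volume for a given callus fresh weight (10 mL per gram)."""
    return callus_fresh_weight_g * ml_per_gram


def enzyme_masses_mg(volume_ml):
    """Mass of each enzyme in mg for volume_ml of mix (1% w/v = 10 mg/mL)."""
    return {name: pct * 10.0 * volume_ml
            for name, pct in DIGEST_MIX_PERCENT_WV.items()}


# About one gram of SEC, as in the paragraph above.
volume = digest_volume_ml(1.0)
masses = enzyme_masses_mg(volume)
for name, mg in masses.items():
    print(f"{name}: {mg:.0f} mg in {volume:.0f} mL")
```

The same helper can be called with a different fresh weight to scale the mix for larger batches of callus.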

[0271] SEC protoplasts released from the callus were isolated by passing the digestion suspension through a 100 micron mesh and into a 50 mL collection tube. To maximize the yield of protoplasts, the digested callus material was washed three times. Each wash was performed by adding 10 mL SEC wash buffer (0.6 M Mannitol, 0.44% w/v MS, pH 5.8) to the PETRI™ dish, swirling gently for 1 min, followed by passing of the SEC wash buffer through the 100 micron sieve into the same 50 mL collection tube. Next, the filtered protoplast suspension was passed through a 70 micron sieve, followed by a 40 micron sieve. Next, 6 mL aliquots of the filtered protoplast suspension were transferred to 12 mL round bottomed centrifugation tubes with lids and centrifuged at 70 g and 12°C for 10 min. Following centrifugation, the supernatant was removed, leaving approximately 0.5 mL supernatant behind, and the protoplast pellets were each resuspended in 7 mL of 22% sucrose solution. The sucrose/protoplast mixture was carefully overlaid with 2 mL SEC wash buffer, ensuring that there was no mixing of the two solutions. The protoplasts were centrifuged a second time, as described above. The band of protoplasts visible between the SEC wash buffer and sucrose solution was collected using a pipette and placed into a clean 12 mL round bottom tube. Seven mL of SEC wash buffer was added to the protoplasts and the tubes were centrifuged, as described above. The supernatant was removed and the SEC protoplasts were combined into a single tube and resuspended in a final volume of 1-2 mL of SEC wash buffer. The yield of SEC protoplasts was estimated using a Neubauer haemocytometer. Evans Blue stain was used to determine the proportion of live cells recovered.

PEG-mediated transfection of SEC protoplasts

[0272] About two million SEC protoplasts were added to a 12 mL round bottomed tube and pelleted by centrifugation at 70 g before removing the supernatant. The protoplasts were gently resuspended in 480 μL SEC wash buffer containing 70 μg of DNA. The DNA consisted of the Zinc Finger Nuclease and donor DNA constructs described above, with each construct present at the molar ratio required for the experiment being undertaken. Next, 720 μL of 50% PEG solution (50% w/v PEG 4000, 0.8 M mannitol, 1 M Ca(NO₃)₂, pH 5.6) was slowly added to the protoplast suspension with simultaneous mixing by gentle rotation of the tube. The protoplast suspension was allowed to incubate for 15 min at room temperature without any agitation.
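As a non-limiting worked example, the concentrations reached in the transfection mixture once the PEG solution has been added can be estimated with the short Python sketch below. The helper names are hypothetical, and the calculation assumes that the volume of the protoplast pellet itself is negligible, which the text does not state.

```python
def mixed_concentration(conc_a, vol_a, conc_b, vol_b):
    """Concentration after combining two solutions (same units for both)."""
    return (conc_a * vol_a + conc_b * vol_b) / (vol_a + vol_b)


WASH_BUFFER_UL = 480.0   # SEC wash buffer holding the protoplasts and DNA
PEG_SOLUTION_UL = 720.0  # 50% w/v PEG 4000 solution
DNA_UG = 70.0

total_ul = WASH_BUFFER_UL + PEG_SOLUTION_UL

# PEG is present only in the PEG solution (0% in the wash buffer).
peg_final_pct = mixed_concentration(50.0, PEG_SOLUTION_UL, 0.0, WASH_BUFFER_UL)

# Mannitol: 0.8 M in the PEG solution, 0.6 M in the SEC wash buffer.
mannitol_final_m = mixed_concentration(0.8, PEG_SOLUTION_UL, 0.6, WASH_BUFFER_UL)

# DNA concentration in micrograms per mL of the final mixture.
dna_ug_per_ml = DNA_UG / total_ul * 1000.0

print(f"PEG {peg_final_pct:.1f}% w/v, mannitol {mannitol_final_m:.2f} M, "
      f"DNA {dna_ug_per_ml:.1f} ug/mL")
```

Under these assumptions the protoplasts are exposed to 30% w/v PEG during the 15 min incubation.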

[0273] An additional 7 mL volume of SEC wash buffer was slowly added to the protoplast suspension in sequential aliquots of 1 mL, 2 mL and 3 mL. Simultaneous gentle mixing was used to maintain a homogenous suspension with each sequential aliquot. Half of the protoplast suspension was transferred to a second 12 mL round bottomed tube and an additional 3 mL volume of SEC wash buffer was slowly added to each tube with simultaneous gentle mixing. The protoplasts were pelleted by centrifugation at 70 g for 10 min and the supernatant was removed. The protoplast pellets were each resuspended in 1 mL SEC wash buffer before protoplasts from the paired round bottomed tubes were pooled into a single 12 mL tube. An additional 7 mL SEC wash buffer was added to the pooled protoplasts before centrifugation as described above. The supernatant was completely removed and the protoplast pellet was resuspended in 2 mL Qiao's media. The protoplast suspension was transferred to a sterile 3 cm PETRI™ dish and incubated in the dark at 24°C for 72 h.

Isolation of scutella from immature zygotic wheat embryos

[0274] Scutella of immature zygotic wheat embryos from the donor wheat line cv. Bobwhite MPB26RH were prepared for transfection using biolistics-mediated DNA delivery as follows.

[0275] Seedlings of the donor wheat line were grown in an environment-controlled growth room maintained at 18/16°C (day/night) and a 16/8 hour (day/night) photoperiod with lighting provided at 800 mmol m per sec. Wheat spikes were collected at 12-14 days post-anthesis and were surface sterilized by soaking for 1 min in 70% (v/v) ethanol. The spikes were threshed and the immature seeds were sterilized for 15 min in 17% (v/v) bleach with gentle shaking, followed by rinsing at least three times with sterile distilled water. The embryos were aseptically isolated from the immature seeds under a dissecting microscope. The embryonic axis was removed using a sharp scalpel and discarded. The scutella were placed into a 9 cm PETRI™ dish containing osmotic MS (E3 maltose) medium, with the uncut scutellum oriented upwards. A total of 20 scutella were plated onto each 9 cm PETRI™ dish. The prepared embryos were pre-cultured in the dark at 26°C for a minimum of 4 h before transfection using biolistics-mediated DNA delivery.

Transfection of scutella of immature zygotic wheat embryos by biolistic-mediated DNA delivery

[0276] Gold particles for biolistic-mediated DNA delivery were prepared by adding 40 mg of 0.6 micron colloidal gold particles (BioRad) to 1 mL of sterile water in a 1.5 mL microtube. The gold particles were resuspended by vortexing for 5 min. To prepare sufficient material for 10 bombardments, a 50 μL aliquot of the gold particle suspension was transferred to a 1.5 mL microtube containing 5 μg of DNA resuspended in 5 μL of sterile water. Following thorough mixing by vortexing, 50 μL of 2.5 M CaCl₂ and 20 μL of 0.1 M spermidine were added to the microtube, with thorough mixing after the addition of each reagent. The DNA-coated gold particles were pelleted by centrifugation for 1 min at maximum speed in a bench top microfuge. The supernatant was removed and 1 mL of 100% ethanol was added to wash and resuspend the gold particles. The gold particles were pelleted by centrifugation, as described above, and the supernatant discarded. The DNA-coated gold particles were resuspended in 110 μL of 100% ethanol and maintained on ice. Following a brief vortex, 10 μL of the gold particle solution was placed centrally onto a macro-carrier membrane and allowed to air dry.
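For illustration, the nominal amounts of gold and DNA loaded per bombardment follow from the volumes above, as in the Python sketch below. The variable names are hypothetical, and the figures assume that no gold or DNA is lost during the ethanol wash and that all of the DNA binds to the particles; neither assumption is stated in the text.

```python
GOLD_STOCK_MG_PER_ML = 40.0   # 40 mg gold in 1 mL sterile water
GOLD_ALIQUOT_UL = 50.0        # aliquot taken for coating
DNA_UG = 5.0                  # DNA added to the aliquot
FINAL_VOLUME_UL = 110.0       # resuspension volume in 100% ethanol
LOAD_PER_SHOT_UL = 10.0       # volume placed on each macro-carrier

# Gold in the coated batch (mg), then converted to micrograms.
gold_batch_mg = GOLD_STOCK_MG_PER_ML * GOLD_ALIQUOT_UL / 1000.0
gold_batch_ug = gold_batch_mg * 1000.0

# Fraction of the batch loaded onto one macro-carrier.
fraction_per_shot = LOAD_PER_SHOT_UL / FINAL_VOLUME_UL

gold_per_shot_ug = gold_batch_ug * fraction_per_shot
dna_per_shot_ng = DNA_UG * 1000.0 * fraction_per_shot
aliquots_available = FINAL_VOLUME_UL / LOAD_PER_SHOT_UL

print(f"{gold_per_shot_ug:.0f} ug gold and {dna_per_shot_ng:.0f} ng DNA per shot; "
      f"{aliquots_available:.0f} aliquots per batch")
```

On these assumptions each macro-carrier carries roughly 0.18 mg of gold and 0.45 μg of DNA, and the 110 μL batch holds eleven 10 μL aliquots, which would leave a small margin over the ten bombardments mentioned.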

[0277] The PDS-1000/HE PARTICLE GUN DELIVERY SYSTEM™ (BioRad) was used to transfect the scutella of immature zygotic wheat embryos by biolistic-mediated DNA delivery. Delivery of the DNA-coated gold particles was performed using the following settings: gap 2.5 cm, stopping plate aperture 0.8 cm, target distance 6.0 cm, vacuum 91.4 - 94.8 kPa, vacuum flow rate 5.0 and vent flow rate 4.5. The scutella of immature zygotic wheat embryos were bombarded using a 900 psi rupture disc. Each PETRI™ dish containing 20 scutella was bombarded once. The bombarded scutella were incubated at 26°C in the dark for 16 h before being transferred onto medium for callus induction. The scutella were cultured on callus induction medium in the dark at 26°C for 7 d.

Genomic DNA isolation from SEC protoplasts

[0278] Genomic DNA was extracted from SEC protoplasts using the procedure previously described for mesophyll protoplasts. An additional purification step was performed to reduce the presence of the donor DNA used for transfection. This was achieved using gel electrophoresis to separate the genomic DNA from the SEC protoplasts from the donor DNA used for transfection. The extracted DNA was electrophoresed for 3 h in a 0.5% agarose gel using 0.5X TBE. The DNA was visualized by SYBR® SAFE staining and the band corresponding to genomic DNA from the SEC protoplasts was excised. The genomic DNA was purified from the agarose gel using a QIAQUICK DNA PURIFICATION KIT™ (Qiagen), following the manufacturer's instructions, except that the QIAQUICK™ DNA purification column was replaced with a DNA binding column from the DNEASY PLANT DNA EXTRACTION MINI KIT™ (Qiagen).

Genomic DNA isolation from scutella of immature zygotic embryos

[0279] The 20 scutella of immature zygotic wheat embryos transfected for each biolistic-mediated DNA delivery were transferred to a 15 mL tube and snap frozen in liquid nitrogen before freeze drying for 24 h in a LABCONCO FREEZONE 4.5® (Labconco, Kansas City, MO) at -40°C and 133 × 10⁻³ mBar pressure. The lyophilized calli were subjected to DNA extraction using the DNEASY® PLANT DNA EXTRACTION MAXI™ KIT (Qiagen) following the manufacturer's instructions.

[0280] An additional purification step was performed to reduce the presence of the donor DNA used for transfection. This was achieved using gel electrophoresis to separate the genomic DNA from the calli from the donor DNA used for transfection. The extracted DNA was electrophoresed for 3 h in a 0.5% agarose gel using 0.5X TBE. The DNA was visualized by SYBR® SAFE staining and the band corresponding to genomic DNA from the calli was excised. The genomic DNA was purified from the agarose gel using a QIAQUICK™ DNA PURIFICATION kit (Qiagen), following the manufacturer's instructions, except that the QIAQUICK™ DNA purification column was replaced with a DNA binding column from the DNEASY® PLANT DNA EXTRACTION MAXI™ KIT (Qiagen).

PCR assay of genomic DNA for ZFN-mediated AHAS editing

[0281] To investigate ZFN-mediated genomic editing at the endogenous AHAS genes in wheat using NHEJ-directed DNA repair, and assess the effect of donor DNA design on the efficacy of each DNA repair pathway, PCR assays were used to amplify the target AHAS regions from genomic DNA of transfected wheat cells. PCR assays were performed as described previously to generate requisite loci specific DNA molecules in the correct format for Illumina-based sequencing-by-synthesis technology. Each assay was performed using the previously described primer pair (SEQ ID NO: 160 and SEQ ID NO: 170) that was designed to amplify the region targeted by ZFNs 29732 and 29730 (encoded on plasmid pDAB109350) and ZFNs 30012 and 30018 (encoded on plasmid pDAB109360) for each of the three homoeologous copies of the AHAS genes. Multiple reactions were performed per transfected sample to ensure that sufficient copies of the Triticum aestivum genome were assayed for reliable assessment of ZFN-mediated gene editing. For transfected SEC protoplasts, up to sixteen PCR assays, equivalent to 200,000 copies of the Triticum aestivum genome taken from individual protoplasts, were performed per transfected sample. For transfected scutella of immature zygotic embryos, about forty eight (48) PCR assays, equivalent to 600,000 copies of the Triticum aestivum genome taken from individual protoplasts, were performed per transfected sample. Each transfected sample was prepared for sequencing using a CBOT CLUSTER GENERATION KIT™ (Illumina) and was sequenced on an ILLUMINA GAII_X™ or HISEQ2000™ instrument (Illumina) to generate 100-bp paired end sequence reads, as described previously.
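The sampling figures above can be related to one another with simple arithmetic, sketched below in Python for illustration. The function names are hypothetical. The "floor" value is an idealised bound that assumes every assayed genome copy is amplified and sequenced, which in practice depends on sequencing depth and is not claimed in the text.

```python
def copies_per_assay(total_genome_copies, n_assays):
    """Average number of genome copies used as template in each PCR assay."""
    return total_genome_copies / n_assays


def frequency_floor_ppm(total_genome_copies):
    """Frequency, in parts per million, of a single edited copy in the sample."""
    return 1_000_000 / total_genome_copies


# SEC protoplasts: up to 16 assays, equivalent to 200,000 genome copies.
sec_copies = copies_per_assay(200_000, 16)
sec_floor = frequency_floor_ppm(200_000)

# Scutella: about 48 assays, equivalent to 600,000 genome copies.
scutella_copies = copies_per_assay(600_000, 48)
scutella_floor = frequency_floor_ppm(600_000)

print(f"SEC: {sec_copies:.0f} copies/assay, floor {sec_floor:.2f} ppm")
print(f"Scutella: {scutella_copies:.0f} copies/assay, floor {scutella_floor:.2f} ppm")
```

Both sample types therefore correspond to the same template load of 12,500 genome copies per assay; only the number of assays differs.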

Data analysis for detecting ZFN-mediated NHEJ-directed editing at AHAS genes

[0282] Following generation of Illumina short read sequence data for sample libraries prepared for transfected SEC protoplasts and scutella of immature zygotic wheat embryos, analyses were performed to identify molecular evidence for ZFN-mediated NHEJ-directed editing at the target ZFN sites.

[0283] To identify sequence reads with molecular evidence for NHEJ-directed gene editing, the short sequence reads were first computationally processed, as previously described, to assign each read to the sample and sub-genome from which it originated, and to perform quality filtering to ensure that only high quality sequences were used for subsequent analyses. Next, custom developed PERL scripts and manual data manipulation in Microsoft Excel 2010 (Microsoft Corporation) were used to identify reads that contained sequence for both the donor DNA molecule used for transfection and the endogenous AHAS locus. The editing frequency (expressed in parts per million reads) was calculated as the proportion of sub-genome-assigned sequence reads that showed evidence for ZFN-mediated NHEJ-directed gene editing.
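The read classification and frequency calculation described above can be sketched as follows. This is a minimal Python illustration and not the PERL scripts actually used, which are not reproduced in this disclosure. The sequences, the function names and the simple substring test are all hypothetical, and the sketch assumes that reads have already been quality filtered, assigned to a sub-genome and oriented on the same strand as the query sequences.

```python
def shows_donor_integration(read, donor_tag, locus_tag):
    """True if a read contains both donor-derived and AHAS-derived sequence."""
    return donor_tag in read and locus_tag in read


def editing_frequency_ppm(reads, donor_tag, locus_tag):
    """Edited reads as a proportion of all assigned reads, in parts per million."""
    if not reads:
        return 0.0
    edited = sum(1 for read in reads
                 if shows_donor_integration(read, donor_tag, locus_tag))
    return edited / len(reads) * 1_000_000


# Toy data: invented sequences, not taken from any SEQ ID NO.
DONOR_TAG = "GATTACAGATTACA"
LOCUS_TAG = "CCGGTTAACC"
toy_reads = [
    "TTT" + LOCUS_TAG + DONOR_TAG + "AAA",  # junction read: locus plus donor
    "TTT" + LOCUS_TAG + "ACGTACGTACGTAAA",  # unedited locus
    "ACGT" + DONOR_TAG + "ACGT",            # donor carry-over only
    "TTT" + LOCUS_TAG + "GGGGCCCCTTTTAAA",  # unedited locus
]

frequency = editing_frequency_ppm(toy_reads, DONOR_TAG, LOCUS_TAG)
print(f"{frequency:.0f} ppm")
```

Requiring both sequences in the same read is what distinguishes a genuine junction from residual donor DNA, which is also the reason for the gel purification step described earlier.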

[0284] From the results of three biological replicates performed for each linear double stranded DNA donor design, molecular evidence was obtained for the enrichment of sequence reads showing ZFN-mediated NHEJ-directed editing at the three homoeologous copies of the endogenous AHAS genes in wheat (Table 7 and Table 8). Strong molecular evidence was obtained for the integration of the linear, double-stranded 41-bp donor molecule at the position of the double strand DNA break created by cleavage of the homoeologous copies of the AHAS gene by ZFNs 29732 and 29730 in samples of both SEC protoplasts and scutella of immature zygotic embryos that were transfected with pDAB109350 and pDAS000152. Similar editing efficiency was observed across the three wheat sub-genomes in these samples. In contrast, samples of SEC protoplasts and scutella of immature zygotic embryos transfected with pDAB109350 and pDAS000153 showed poor evidence for ZFN-mediated NHEJ-directed gene editing, presumably due to the requirement for in planta release of the 41-bp donor sequence from the plasmid backbone.

Molecular evidence for the replacement of endogenous AHAS sequence with the 41-bp donor molecule was observed in both SEC protoplasts and scutella of immature zygotic embryos that were transfected with pDAB109350, pDAB109360 and pDAS000149. However, the frequency of editing was significantly lower than that observed for transfections performed using pDAB109350 and pDAS000152, presumably due to the requirement for dual ZFN cleavage of the endogenous AHAS sequence. Limited evidence was obtained for the replacement of endogenous AHAS sequence with the 41-bp donor molecule that required in planta release from the plasmid backbone in samples of SEC protoplasts and scutella of immature zygotic embryos that were transfected with pDAB109350, pDAB109360 and pDAS000150.

Table 7: Average NHEJ editing frequency in parts per million (ppm) across three biological replicates of scutella transfected with linear double-stranded donor DNA designs.

[Table 7 body not recovered; the only surviving row fragment reads: 29732-2A-29730, pDAS000150, 10:1, D, 0]

Table 8: Average NHEJ editing frequency in parts per million (ppm) across three biological replicates of SEC protoplasts transfected with linear double-stranded donor DNA designs; "na" indicates "not applicable."

[0285] Collectively, the results provide strong molecular evidence for precise ZFN-mediated NHEJ-directed editing at the endogenous AHAS gene locus in wheat. These results show that all three sub-genomes can be targeted with a single ZFN and donor. The results clearly demonstrate a higher frequency of editing for linear donor DNA designs as compared to plasmid donor DNA designs. Presumably, these results are due to the requirement for in planta linearization of the plasmid donor molecules before they can participate in NHEJ-directed DNA repair. The results also indicate that sub-genome-specific NHEJ-directed gene editing is facilitated by a double strand break. The ZFNs that were designed to induce the double strand DNA breaks resulted in sub-genome-specific NHEJ-directed gene editing when delivered with the donor DNA to the Triticum aestivum plant cells.

Development of a transformation system for producing AHAS edited plants

[0286] The endogenous AHAS gene locus in wheat was selected as a model locus to develop a transformation system for generating plants with precise genome modifications induced by ZFN-mediated gene editing. The endogenous AHAS gene was selected as a model locus due to its ability to produce a selectable phenotype (i.e., tolerance to group B herbicides - ALS inhibitors), prior knowledge of the sub-genome-specific gene coding sequences, and knowledge of specific mutations conferring tolerance to group B herbicides from the characterization of wheat with chemically induced mutations in the AHAS genes. The S653N mutation conferring tolerance to imidazolinone class herbicides was chosen as a target for ZFN-mediated gene editing due to the availability of commercially released wheat varieties carrying the S653N mutation that could be used as positive controls to develop a chemical selection system to enrich for precisely edited events.

Molecular characterization of Triticum aestivum cv. Clearfield Janz

[0287] Triticum aestivum cv. Clearfield Janz, a commercially released bread wheat variety carrying the S653N mutation in the D-genome, was selected for use as a positive control to develop a chemical selection strategy to enrich for AHAS edited wheat plants produced by ZFN-mediated gene editing. To generate a pure genetic seed stock, 48 seedlings were screened with 96 microsatellite (SSR) markers using Multiplex-Ready PCR technology (Hayden et al., (2008) BMC Genomics 9:80). Seedlings with identical SSR haplotypes were used to produce seed that was used in subsequent experiments.

[0288] To ensure that the wheat plants used to produce seed carried the S653N mutation, a PCR assay was developed to amplify the region of the AHAS gene carrying the mutation from the D-genome of wheat. Sub-genome-specific amplification was achieved using on-off PCR (Yang et al., (2005) Biochemical and Biophysical Research Communications 328:265-72) with primers AHAS-PS-6DF2 and AHAS-PS-6DR2 (SEQ ID NO:183 and SEQ ID NO:184) designed to position the penultimate base (which contained a phosphorothioate linkage) over nucleotide sequence variation that distinguished between the homoeologous copies of the AHAS genes. The PCR primers were designed to be between 18 and 27 nucleotides in length and to have a melting temperature of 60 to 65°C, optimal 63°C. The amplified PCR products were purified using a QIAQUICK MINIELUTE PCR PURIFICATION KIT™ (Qiagen) and sequenced using a direct Sanger sequencing method. The sequencing products were purified with ethanol, sodium acetate and EDTA following the BIGDYE® v3.1 protocol (Applied Biosystems) and electrophoresis was performed on an ABI3730XL® automated capillary electrophoresis platform.
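The primer length and melting temperature criteria stated above can be checked programmatically, as in the illustrative Python sketch below. The text does not say how melting temperature was calculated; the sketch assumes the basic GC-content approximation Tm = 64.9 + 41 × (G + C − 16.4) / N, which is one common estimate for oligomers longer than about 13 nucleotides, and other methods will give different values. The example primer is invented and is not SEQ ID NO:183 or SEQ ID NO:184.

```python
def basic_tm_celsius(primer):
    """Approximate Tm from GC content (assumed formula, see lead-in)."""
    primer = primer.upper()
    gc_count = primer.count("G") + primer.count("C")
    return 64.9 + 41.0 * (gc_count - 16.4) / len(primer)


def meets_design_rules(primer, min_len=18, max_len=27, tm_min=60.0, tm_max=65.0):
    """Check the 18-27 nt length window and the 60-65 degree C Tm window."""
    if not min_len <= len(primer) <= max_len:
        return False
    return tm_min <= basic_tm_celsius(primer) <= tm_max


# Hypothetical 24-nt primer with 15 G/C bases, for illustration only.
example_primer = "GCTAGCCTGGATCCGTACGGTCAC"
example_tm = basic_tm_celsius(example_primer)

print(f"length {len(example_primer)} nt, Tm about {example_tm:.1f} C, "
      f"passes: {meets_design_rules(example_primer)}")
```

A check of this kind only screens candidates; sub-genome specificity in on-off PCR still depends on placing the penultimate base over a distinguishing nucleotide, which the sketch does not model.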

[0289] Analysis of the amplified AHAS gene sequences using SEQUENCHER v3.7™ (GeneCodes, Ann Arbor, MI) revealed segregation for the S653N mutation and enabled the identification of plants that were homozygous (N653/N653) and heterozygous (N653/S653) for the S653N mutation or homozygous (S653/S653) for the herbicide-susceptible allele. The harvest of seed from individual plants provided a seed source having different levels of zygosity for the S653N mutation in the cv. Clearfield Janz genetic background.

Optimization of chemical selection conditions based on IMAZAMOX™

[0290] A series of experiments were performed to determine optimal selection conditions for regenerating AHAS edited wheat plants. These experiments were based on testing the basal tolerance to IMAZAMOX™ of the donor wheat line cv. Bobwhite MPB26RH (S653/S653 genotype) at the callus induction, plant regeneration and rooting stages of an established wheat transformation system. Similar experiments were performed to determine the basal tolerance and resistance of cv. Clearfield Janz genotypes carrying the different doses of the S653N mutation; i.e., plants with N653/N653 and S653/S653 genotypes.

[0291] The basal tolerance of the donor wheat line cv. Bobwhite MPB26RH and basal resistance of the cv. Clearfield Janz (N653/N653) genotype to IMAZAMOX® at the callus induction stage were determined as follows: Scutella of immature zygotic embryos from each wheat line were isolated as described previously and placed in 10 cm PETRI™ dishes containing CIM medium supplemented with 0, 50, 100, 200, 300, 400 and 500 nM IMAZAMOX®, respectively. Twenty scutella were placed in each PETRI™ dish. A total of 60 scutella from each of the donor wheat line cv. Bobwhite MPB26RH and the cv. Clearfield Janz genotype were tested for basal tolerance and basal resistance response, respectively, at each IMAZAMOX® concentration. After incubation at 24°C in the dark for 4 weeks, the amount of somatic embryogenic callus (SEC) formation at each IMAZAMOX® concentration was recorded. The results showed that SEC formation for cv. Bobwhite MPB26RH was reduced by about 70% at 100 nM IMAZAMOX®, compared to untreated samples. Callus formation for the cv. Clearfield Janz genotype was unaffected, relative to the untreated control, at any of the IMAZAMOX® concentrations tested.

[0292] The basal tolerance of the donor wheat line cv. Bobwhite MPB26RH to IMAZAMOX® at the plant regeneration stage was determined as follows: Scutella of immature zygotic embryos from the donor wheat line were isolated as described previously and placed in 10 cm PETRI™ dishes containing CIM medium. Somatic embryogenic callus was allowed to form by incubating at 24°C in the dark for 4 weeks. The SEC was transferred to 10 cm PETRI™ dishes containing DRM medium supplemented with 0, 100, 200, 300, 400, 500 and 1000 nM IMAZAMOX® respectively. Twenty CIM were placed in each PETRI™ dish. A total of 60 CIM were tested for basal tolerance response at each IMAZAMOX® concentration. After incubation for 2 weeks at 24°C under a 16/8 (light/dark) hour photoperiod in a growth room, the regeneration response was recorded. The results showed that plant regeneration was reduced by about 80% at 200 nM IMAZAMOX®, compared to untreated samples.

[0293] The basal tolerance of the cv. Clearfield Janz (S653/S653) genotype and basal resistance of the cv. Clearfield Janz (N653/N653) genotype to IMAZAMOX® at the plant regeneration stage were determined using a modified approach, as cv. Clearfield Janz was observed to have poor plant regeneration response (i.e., poor embryogenesis) in tissue culture. Seed for each cv. Clearfield Janz genotype was germinated using the aseptic approach described above for producing wheat mesophyll protoplasts. The germinated seedlings were multiplied in vitro by sub-culturing on multiplication medium. Following multiplication, plants for each genotype were transferred to 10 cm PETRI™ dishes containing plant growth medium (MS +10 μΜ BA +0.8% agar) supplemented with 0, 100, 300, 600, 900, 1200, 1500 and 3000 nM IMAZAMOX®, respectively. Ten plants were placed in each PETRI™ dish. A total of 30 plants per genotype were tested for basal response at each IMAZAMOX® concentration. After incubation for 3 weeks at 24°C under a 16/8 (light/dark) hour photoperiod in a growth room, the growth response was recorded. The results showed that plant growth for the cv. Clearfield Janz (S653/S653) genotype was severely reduced in medium containing at least 200 nM IMAZAMOX®, compared to untreated samples. This response was similar to that observed for the cv. Bobwhite MPB26RH (S653/S653) genotype. In contrast, plant growth for the cv. Clearfield Janz (N653/N653) genotype was not strongly suppressed, relative to untreated samples, until the IMAZAMOX® concentration exceeded 2,000 nM.

[0294] The basal tolerance of the donor wheat line cv. Bobwhite MPB26RH to IMAZAMOX® at the plant rooting stage was determined as follows: Scutella of immature zygotic embryos from the donor wheat line were isolated as described previously and placed in 10 cm PETRI™ dishes containing CIM medium. Somatic embryogenic callus was allowed to form by incubating at 24°C in the dark for 4 weeks. The SEC was transferred to 10 cm PETRI™ dishes containing DRM medium and incubated for 2 weeks at 24°C under a 16/8 (light/dark) hour photoperiod to allow plant regeneration to take place. Regenerated plants were transferred to 10 cm PETRI™ dishes containing RM medium supplemented with 0, 100, 200, 300, 400 and 500 nM IMAZAMOX®, respectively. Twenty regenerated plants were placed in each PETRI™ dish. A total of 60 regenerated plants were tested for basal tolerance response at each IMAZAMOX® concentration. After incubation for 3 weeks at 24°C under a 16/8 (light/dark) hour photoperiod in a growth room, the root formation response was recorded. The results showed that root formation was severely restricted at all concentrations of IMAZAMOX® tested, compared to untreated samples.

[0295] The basal tolerance of the cv. Clearfield Janz (S653/S653) genotype and basal resistance of the cv. Clearfield Janz (N653/N653) genotype to IMAZAMOX® at the plant rooting stage were determined using a modified approach, as cv. Clearfield Janz was observed to have poor plant regeneration response (i.e., poor embryogenesis) in tissue culture. Seed for each cv. Clearfield Janz genotype was germinated using the aseptic approach described above for producing wheat mesophyll protoplasts. The germinated seedlings were multiplied in vitro by sub-culturing on multiplication medium. Following multiplication, plants for each genotype were transferred to 10 cm PETRI™ dishes containing plant rooting medium (1/2 MS, 0.5 mg/L NAA, 0.8% agar) supplemented with 0, 50, 100, 200 and 250 nM IMAZAMOX®, respectively. Three plants were placed in each PETRI™ dish. A total of 6 plants per genotype were tested for basal response at each IMAZAMOX® concentration. After incubation for 2 weeks at 24°C under a 16/8 (light/dark) hour photoperiod in a growth room, the root formation response was recorded.

[0296] The results showed that root formation for the cv. Clearfield Janz (N653/N653) genotype was restricted, compared to untreated samples, at 250 nM IMAZAMOX®. Root formation was severely restricted in the cv. Clearfield Janz (S653/S653) genotype at all concentrations of IMAZAMOX® tested, compared to untreated samples.

Design and Synthesis of Donor DNA for ZFN-Mediated AHAS Gene Editing

[0297] Donor DNA molecules were designed to promote precise ZFN-mediated NHEJ-directed gene editing at the endogenous AHAS genes in wheat. The donor designs allowed for the introduction of the S653N mutation known to confer tolerance to imidazolinone class herbicides (Li et al., (2008) Molecular Breeding 22:217-225).

[0298] The first design was based on the integration of a 95-bp double stranded donor molecule at the position of the double strand DNA break created by cleavage of a homoeologous copy of the endogenous AHAS gene by ZFNs 29732 and 29730 (encoded on plasmid pDAB109350). The donor DNA molecule, pDAS000267 (SEQ ID NO:423 and SEQ ID NO:424), comprised two portions of the integrating donor polynucleotide. The 5' end contained sequence near identical to the endogenous AHAS gene encoded in the D-genome, starting from the target ZFN cleavage site and finishing at the AHAS stop codon. Six intentional mutations were introduced into this sequence: two mutations encoded the S653N mutation (AGC->AAT), and four mutations were synonymous (i.e., silent changes that do not alter the encoded amino acid sequence). The 3' end of the donor molecule contained a unique sequence that could be used for diagnostic PCR to detect ZFN-mediated NHEJ-directed gene editing events. The donor molecule was designed with protruding 5' and 3' ends to provide ligation overhangs to facilitate ZFN-mediated NHEJ-directed DNA repair.
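The distinction drawn above between the two mutations that encode S653N and the four synonymous mutations can be illustrated with a short Python sketch. The helper names are hypothetical, and the lookup table contains only the four codons needed for the example, taken from the standard genetic code; it is not the donor sequence itself.

```python
# Subset of the standard genetic code: serine (S) and asparagine (N) codons.
CODON_TO_AMINO_ACID = {
    "AGC": "S",
    "AGT": "S",
    "AAT": "N",
    "AAC": "N",
}


def count_base_changes(codon_a, codon_b):
    """Number of positions at which two equal-length codons differ."""
    return sum(1 for a, b in zip(codon_a, codon_b) if a != b)


def is_synonymous(codon_a, codon_b):
    """True if both codons encode the same amino acid (a silent change)."""
    return CODON_TO_AMINO_ACID[codon_a] == CODON_TO_AMINO_ACID[codon_b]


# The S653N change described in the text: AGC -> AAT.
changes = count_base_changes("AGC", "AAT")
before = CODON_TO_AMINO_ACID["AGC"]
after = CODON_TO_AMINO_ACID["AAT"]
print(f"AGC -> AAT: {changes} base changes, {before} -> {after}")

# A silent change for contrast (illustrative; not one of the donor's four).
print("AGC -> AGT synonymous:", is_synonymous("AGC", "AGT"))
```

The AGC to AAT substitution alters two bases and converts serine to asparagine, whereas a change such as AGC to AGT leaves the encoded residue unchanged and would serve only as a sequence marker.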

[0299] The second design was based on replacement of the endogenous AHAS sequence located between a pair of ZFN target sites with a 79-bp double stranded donor molecule. Specifically, the donor was designed to replace the endogenous AHAS sequence released from chromatin upon dual cleavage of a homoeologous copy of the AHAS gene by ZFNs 29732 and 29730 (encoded on plasmid pDAB109350) and ZFNs 30012 and 30018 (encoded on plasmid pDAB109360). The donor molecule, pDAS000268 (SEQ ID NO:425 and SEQ ID NO:426), comprised sequence near identical to the endogenous AHAS gene encoded in the D-genome, starting from the cleavage site for ZFNs 29732 and 29730, and finishing at the cleavage site for ZFNs 30012 and 30018. Ten deliberate mutations were introduced into this sequence. Six mutations were located at the 5' end of the donor: two mutations encoded the S653N mutation (AGC->AAT) and four mutations were synonymous. Four mutations were located at the 3' end of the donor and were located in non-coding sequence. The donor molecule was designed with protruding 5' and 3' ends to provide ligation overhangs to facilitate ZFN-mediated NHEJ-directed DNA repair.

[0300] Standard phosphoramidite chemistry was used to synthesize the double stranded DNA donor molecules (Integrated DNA Technologies). For each donor molecule, a pair of complementary single stranded DNA oligomers was synthesized, each with two phosphorothioate linkages at its 5' end to provide protection against in planta endonuclease degradation. The single stranded DNA oligomers were purified by high performance liquid chromatography to enrich for full-length molecules and purified of chemical carryover from the synthesis steps using Na⁺ exchange. The double stranded donor molecule was formed by annealing equimolar amounts of the two complementary single-stranded DNA oligomers using standard methods commonly known by one skilled in the art. Before delivery to Triticum aestivum, the double stranded DNA molecules were diluted to the required concentration in sterile water.

Design and Production of Binary Vector Encoding AHAS (S653N)

[0301] Standard cloning methods were used in the construction of binary vector pDAS000143 (SEQ ID NO:185) (Figure 15). The AHAS (S653N) gene expression cassette consists of the promoter, 5' untranslated region and intron from the Ubiquitin (Ubi) gene from Zea mays (Toki et al, (1992) Plant Physiology 100; 1503-07) followed by the coding sequence (1935 bp) of the AHAS gene from T. aestivum with base-pairs 1880 and 1881 mutated from GC to AT in order to induce an amino acid change from serine (S) to asparagine (N) at amino acid residue 653. The AHAS expression cassette included the 3' untranslated region (UTR) of the nopaline synthase gene (nos) from A. tumefaciens pTi15955 (Fraley et al, (1983) Proceedings of the National Academy of Sciences U.S.A. 80(15); 4803-4807). The selection cassette comprised the promoter, 5' untranslated region and intron from the actin 1 (Act1) gene from Oryza sativa (McElroy et al, (1990) The Plant Cell 2(2); 163-171) followed by a synthetic, plant-optimized version of phosphinothricin acetyl transferase (PAT) gene, isolated from Streptomyces viridochromogenes, which encodes a protein that confers resistance to inhibitors of glutamine synthetase comprising phosphinothricin, glufosinate, and bialaphos (Wohlleben et al., (1988) Gene 70(1); 25-37). This cassette was terminated with the 3' UTR from the 35S gene of cauliflower mosaic virus (CaMV) (Chenault et al., (1993) Plant Physiology 101 (4); 1395-1396).

[0302] The selection cassette was synthesized by a commercial gene synthesis vendor (GeneArt, Life Technologies) and cloned into a Gateway-enabled binary vector with the RfA Gateway cassette located between the Ubiquitin (Ubi) gene from Zea mays and the 3' untranslated region (UTR) comprising the transcriptional terminator and polyadenylation site of the nopaline synthase gene (nos) from A. tumefaciens pTi15955. The AHAS(S653N) coding sequence was amplified with flanking attB sites and sub-cloned into pDONR221. The resulting ENTRY clone was used in an LR CLONASE II™ (Invitrogen, Life Technologies) reaction with the Gateway-enabled binary vector encoding the phosphinothricin acetyl transferase (PAT) expression cassette. Colonies of E. coli cells transformed with all ligation reactions were initially screened by restriction digestion of miniprep DNA. Restriction endonucleases were obtained from New England BioLabs and Promega. Plasmid preparations were performed using the QIAPREP SPIN MINIPREP KIT™ or the PURE YIELD PLASMID MAXIPREP SYSTEM™ (Promega Corporation, WI) following the manufacturer's instructions. Plasmid DNA of selected clones was sequenced using ABI Sanger Sequencing and BIG DYE TERMINATOR v3.1™ cycle sequencing protocol (Applied Biosystems, Life Technologies). Sequence data were assembled and analyzed using the SEQUENCHER SOFTWARE™ (Gene Codes Corporation, Ann Arbor, MI).

Biolistic-mediated transformation system for generating AHAS edited wheat plants

[0303] About 23,000 scutella of immature zygotic embryos from the donor wheat line cv. Bobwhite MPB26RH were prepared for biolistics-mediated DNA delivery, as described previously. DNA-coated gold particles were prepared as described above with the following formulations. For transfections performed using pDAS000267, the donor DNA was mixed at a 5:1 molar ratio with plasmid DNA for pDAB109350 (encoding ZFNs 29732 and 29730). For transfections performed using pDAS000268, the donor DNA was mixed at a 10:1:1 molar ratio with plasmid DNA for pDAB109350 (encoding ZFNs 29732 and 29730) and pDAB109360 (encoding ZFNs 30012 and 30018). Transfections performed using pDAS000143 were performed using gold particles that were coated only with plasmid DNA for pDAS000143.
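Because the donor is far shorter than the ZFN plasmids, a molar excess of donor corresponds to a small mass of donor DNA. The sketch below illustrates the conversion only; it assumes both molecules are double stranded DNA with the same average mass per base pair, and the 10,000-bp plasmid size is a hypothetical placeholder, since the plasmid sizes are not stated here. The function name is likewise an assumption.

```python
def donor_mass_per_plasmid_mass(donor_bp, plasmid_bp, molar_ratio):
    """Mass of donor per unit mass of plasmid for a donor:plasmid molar ratio.

    Assumes both molecules are dsDNA with the same average mass per base
    pair, so the per-bp mass cancels and only the lengths matter.
    """
    if donor_bp <= 0 or plasmid_bp <= 0:
        raise ValueError("lengths must be positive")
    return molar_ratio * donor_bp / plasmid_bp


DONOR_BP = 79                   # pDAS000268 donor length stated in the text
HYPOTHETICAL_PLASMID_BP = 10_000  # placeholder; not taken from the source

# 10 donor molecules per molecule of one ZFN plasmid (the 10:1:1 formulation).
mass_ratio = donor_mass_per_plasmid_mass(DONOR_BP, HYPOTHETICAL_PLASMID_BP, 10)
```

Under these assumptions, each microgram of a 10,000-bp plasmid would be paired with 0.079 micrograms of donor; a different plasmid size scales the result proportionally.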

[0304] Biolistic-mediated transfections were performed as described previously. A total of 15,620 scutella were bombarded with gold particles coated with DNA containing pDAS000267, a total of 7,310 scutella were bombarded with gold particles coated with DNA containing pDAS000268, and a total of 2,120 scutella were bombarded with gold particles coated with pDAS000143. Following bombardment, the transfected scutella were incubated at 26°C in the dark for 16 h before being transferred onto medium for callus induction.

[0305] Four different chemical selection strategies based on IMAZAMOX® were used to enrich for regenerated wheat plants that had the S653N mutation precisely integrated into one or more homoeologous copies of the endogenous AHAS gene by ZFN-mediated NHEJ-directed gene editing. The four chemical selection strategies are described in Table 9. For each strategy, scutella were cultured in the dark on callus induction medium at 24°C for 2 weeks. The resultant calli were sub-cultured once onto fresh callus induction medium and kept in the same conditions for a further two weeks. Somatic embryogenic callus (SEC) was transferred onto plant regeneration medium and cultured for 2 weeks at 24°C under a 16/8 (light/dark) hour photoperiod in a growth room. Regenerated plantlets were transferred onto rooting medium and cultured under the same conditions for 2-3 weeks. To increase stringency for the selection of regenerated plants having the S653N mutation, the roots of regenerated plants were removed and the plants were again sub-cultured on rooting media under the same conditions. Plantlets rooting a second time were transferred to soil and grown under glasshouse containment conditions. T₁ seed was harvested from individual plants, following bagging of individual spikes to prevent out-crossing.

[0306] The scutella explants bombarded with gold particles coated with pDAS000143 were used to monitor the selection stringency across the four chemical selection strategies for regenerating wheat plants carrying the AHAS S653N mutation. Plants transformed with pDAS000143 were regenerated using the process described above.

Table 9: Chemical selection strategies used to regenerate wheat plants that had the S653N mutation precisely integrated into one or more homoeologous copies of the endogenous AHAS gene by ZFN-mediated NHEJ-directed gene editing.

[0307] Overall, 14 putatively ZFN-mediated NHEJ-directed AHAS edited wheat plants were recovered from the transfection of 22,930 scutella of immature zygotic embryos from the donor wheat line cv. Bobwhite MPB26RH. Putatively edited plants were obtained from all four selection strategies for scutella bombarded with gold particles coated with DNA containing pDAS000267. Two putatively edited plants were obtained from the second selection strategy for scutella bombarded with gold particles coated with DNA containing pDAS000268. A total of 129 putatively transformed wheat plants carrying at least one randomly integrated copy of the AHAS (S653N) donor polynucleotide were recovered across the four chemical selection strategies.

Molecular Characterization of Edited Wheat Plants

[0308] The wheat plants resulting from bombardments with a donor polynucleotide encoding the S653N mutation were obtained and molecularly characterized to identify the wheat sub-genomes that comprised an integration of the S653N mutation that occurred as a result of the donor integration at a genomic double strand cleavage site. Two series of bombardments were completed. The first set of experiments was completed with pDAS000143, and the second set of experiments was completed with pDAS000267 and pDAS000268. Individual wheat plants were obtained from both sets of experiments and assayed via a molecular method to identify plants which contained an integrated copy of the AHAS donor polynucleotide encoding the S653N mutation.

[0309] A hydrolysis probe assay (analogous to the TAQMAN® based assay) for quantitative PCR analysis was used to confirm that recovered wheat plants that had been bombarded with pDAS000143 carried at least one randomly integrated copy of the AHAS donor polynucleotide encoding the S653N mutation. Confirmation via Sanger sequence analysis indicated that wheat plants recovered from bombardments performed with pDAS000267 and pDAS000268 comprised the S653N donor polynucleotide in at least one of the homoeologous copies of the AHAS gene at the position expected for ZFN-mediated NHEJ-directed gene editing.

Genomic DNA isolation from regenerated wheat plants

[0310] Genomic DNA was extracted from freeze-dried leaf tissue harvested from each regenerated wheat plant. Freshly harvested leaf tissue was snap frozen in liquid nitrogen and freeze-dried for 24 h in a LABCONCO FREEZONE 4.5® (Labconco, Kansas City, MO) at -40°C and 133 x 10⁻³ mBar pressure. The lyophilized material was subjected to DNA extraction using the DNEASY® PLANT DNA EXTRACTION MINI KIT™ (Qiagen) following the manufacturer's instructions.

PCR assay to confirm random integration of AHAS donor polynucleotide encoding S653N mutation

[0311] To confirm that the regenerated wheat plants from bombardments performed with pDAS000143 carried at least one randomly integrated copy of the AHAS donor polynucleotide encoding the S653N mutation, a duplex hydrolysis probe qPCR assay (analogous to TAQMAN®) was used to amplify the endogenous single copy gene, puroindoline-b (Pinb), from the D genome of hexaploid wheat (Gautier et al, (2000) Plant Science 153, 81-91; SEQ ID NO:186, SEQ ID NO:187 and SEQ ID NO:188 for forward and reverse primers and probe sequence, respectively) and a region of the Actin (Act1) promoter present on pDAS000143 (SEQ ID NO:189, SEQ ID NO:190 and SEQ ID NO:191 for forward and reverse primers and probe sequence, respectively). Hydrolysis probe qPCR assays were performed on 24 randomly chosen wheat plants that were recovered from each of the four chemical selection strategies. Assessment for the presence, and estimated copy number, of pDAS000143 was performed according to the method described in Livak and Schmittgen (2001) Methods 25(4):402-8.
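The Livak and Schmittgen reference describes the comparative 2^-ΔΔCt calculation. The sketch below shows how that calculation could be applied to a duplex assay of the kind described, with the Act1 promoter amplicon as target and Pinb as the single-copy reference. It is an illustration under stated assumptions, not the procedure actually used: the Ct values are hypothetical, amplification efficiency is assumed to be close to 100% for both amplicons, and the use of a calibrator sample of known copy number is an assumption introduced here.

```python
def relative_quantity(ct_target, ct_reference, ct_target_cal, ct_reference_cal):
    """Return 2^-(ddCt) for a sample relative to a calibrator.

    dCt   = Ct(target) - Ct(reference), computed per sample
    ddCt  = dCt(sample) - dCt(calibrator)
    """
    delta_sample = ct_target - ct_reference
    delta_calibrator = ct_target_cal - ct_reference_cal
    return 2.0 ** -(delta_sample - delta_calibrator)


def estimate_copy_number(rq, calibrator_copies=1):
    """Scale the relative quantity by the calibrator's known copy number (assumed)."""
    return rq * calibrator_copies


# Hypothetical Ct values: the target crosses threshold one cycle earlier
# in the sample than in a single-copy calibrator, relative to Pinb.
rq = relative_quantity(ct_target=24.0, ct_reference=25.0,
                       ct_target_cal=25.0, ct_reference_cal=25.0)
copies = estimate_copy_number(rq, calibrator_copies=1)
```

With these hypothetical inputs the target is twice as abundant relative to the reference as in the calibrator, so the estimate is two copies. A plant with no amplification of the target would be scored as lacking the construct.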

[0312] From the results, conclusive evidence was obtained for the integration of at least one copy of the AHAS donor polynucleotide encoding the S653N mutation into the genome of each of the wheat plants tested. These results indicate that the four chemical selection strategies provided stringent selection for the recovery of plants expressing the S653N mutation.

PCR assay of genomic DNA for ZFN-mediated AHAS editing

[0313] To characterize the sub-genomic location and outcome of ZFN-mediated NHEJ-directed gene editing in the recovered wheat plants, PCR with primers AHAS_3F1 and AHAS_3R1 (SEQ ID NO:192 and SEQ ID NO:193) was used to amplify the target region from the homoeologous copies of the AHAS genes. The resulting PCR products were cloned into plasmid vector and Sanger sequenced using BIGDYE® v3.1 chemistry (Applied Biosystems) on an ABI3730XL® automated capillary electrophoresis platform. Sanger sequencing of up to 120 independent plasmid clones was performed to ensure that each allele at the endogenous AHAS homoeologs was sequenced. Sequence analysis performed using SEQUENCHER SOFTWARE™ was used to generate a consensus sequence for each allele of the three homoeologous copies of the AHAS gene in each of the recovered wheat plants, and to determine the sub-genomic origin and sequence for each edited allele.
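The reason for sequencing many independent clones can be illustrated with a simple sampling calculation. The sketch below is an illustration only: it assumes that each clone is an independent draw and that the six possible alleles (two per homoeolog) are equally represented among the clones, which need not hold in practice because of amplification or cloning bias. The function names are assumptions.

```python
def prob_allele_missed(allele_fraction, n_clones):
    """Probability that a given allele is absent from n independent clones."""
    if not 0.0 <= allele_fraction <= 1.0:
        raise ValueError("allele_fraction must be between 0 and 1")
    return (1.0 - allele_fraction) ** n_clones


def bound_any_allele_missed(n_alleles, n_clones):
    """Union (upper) bound on missing at least one of n equally represented alleles."""
    per_allele = prob_allele_missed(1.0 / n_alleles, n_clones)
    return min(1.0, n_alleles * per_allele)


# Six alleles: three homoeologous AHAS copies, two alleles each.
p_miss_120 = prob_allele_missed(1.0 / 6.0, 120)
p_any_120 = bound_any_allele_missed(6, 120)
p_miss_20 = prob_allele_missed(1.0 / 6.0, 20)
```

Under the equal-representation assumption, the chance of missing an allele falls geometrically with the number of clones, which is why a negative result from a large number of clones is taken as evidence that an alternate allele is absent.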

[0314] From the results, conclusive evidence for precise ZFN-mediated NHEJ-directed gene editing at the endogenous AHAS loci was demonstrated for 11 of the 12 recovered wheat plants that were transformed using pDAB109350 and pDAS000267 (Table 10), and both of the recovered wheat plants that were transformed using pDAB109350, pDAB109360 and pDAS000268 (Table 11). Plants with a range of editing outcomes were observed including: (1) independent events with perfect sub-genome-specific allele edits; (2) events with single perfect edits in the A-genome, B-genome and D-genome; (3) events with simultaneous editing in multiple sub-genomes; and, (4) events demonstrating hemizygous and homozygous sub-genome-specific allele editing. Disclosed for the first time is a method which can be utilized to mutate a gene locus within all three genomes of a wheat plant. Wheat plants comprising an integrated AHAS donor polynucleotide encoding a S653N mutation are exemplified; integration of the polynucleotide sequence provides tolerance to imidazolinone class herbicides. The utilization of ZFN-mediated genomic editing at an endogenous gene locus in wheat allows for the introduction of agronomic traits (via mutation) without time-consuming wheat breeding techniques which require backcrossing and introgression steps that can increase the amount of time required for introgressing the trait into all three sub-genomes. Consensus Sanger sequences for the alleles present in each sub-genome for the edited wheat plants are provided as SEQ ID NO:194-277 in Tables 10 and 11.

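Table 10: ZFN-mediated NHEJ-directed AHAS editing outcomes for wheat plants transformed using pDAB109350 and pDAS000267

Plant | Row | A-genome allele 1 | A-genome allele 2 | B-genome allele 1 | B-genome allele 2 | D-genome allele 1 | D-genome allele 2 | SEQ ID NO:
No.4 | No. clones¹ | 6 | 11 | 44 | 30 | 6 | 11 |
No.5 | Status | PE | UE | NHEJ | UE | UE | nd | 218-223
No.5 | No. clones¹ | 10 | 9 | 15 | 26 | 21 | 0 |
No.6 | Status | UE | nd | PE | UE | UE | nd | 224-229
No.6 | No. clones¹ | 22 | 0 | 11 | 18 | 43 | 0 |
No.7 | Status | PE | UE | UE | nd | UE | nd | 230-235
No.7 | No. clones¹ | 5 | 12 | 26 | 0 | 22 | 0 |
No.8 | Status | UE | nd | UE | nd | UE | nd | 236-241
No.8 | No. clones¹ | 32 | 0 | 40 | 0 | 26 | 0 |
No.9 | Status | PE | nd | IE | UE | UE | nd | 242-247
No.9 | No. clones¹ | 24 | 0 | 13 | 21 | 33 | 0 |
No.10 | Status | PE | UE | UE | nd | UE | nd | 248-253
No.10 | No. clones¹ | 10 | 19 | 37 | 0 | 29 | 0 |
No.11 | Status | UE | nd | UE | nd | PE | UE | 254-259
No.11 | No. clones¹ | 35 | 0 | 37 | 0 | 15 | 11 |
No.12 | Status | UE | nd | UE | nd | IE | NHEJ | 260-265
No.12 | No. clones¹ | 34 | 0 | 40 | 0 | 14 | 8 |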

¹ Number of independent plasmid clones sequenced.

PE = perfect edit; i.e., ZFN-mediated NHEJ-directed genome editing produced a predicted outcome.

IE = imperfect edit; i.e., ZFN-mediated NHEJ-directed genome editing produced an unpredicted outcome.

UE = unedited allele; i.e., allele had wild-type sequence.

nd = not detected; i.e., sufficient independent plasmid clones were sequenced to conclude that an alternate allele was not present and that the locus was homozygous for a single allele.

NHEJ = Non Homologous End Joining; i.e., evidence for a non-homologous end joining DNA repair outcome that did not result in the integration of a donor molecule at the ZFN cleavage site.
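The status calls and clone counts in Table 10 can be cross-checked against the legend. The sketch below transcribes the rows for plants No.5 to No.12 and tests two things: that "nd" always coincides with a clone count of zero, and which plants carry at least one PE or IE call. It is an illustrative consistency check only. Treating both PE and IE as edited alleles, and reading the clone count printed as "1 1" for plant No.6 as 11, are assumptions made here.

```python
# Rows transcribed from Table 10 (plants No.5 to No.12).
# Each entry: (status per allele column, clones per allele column).
# The value printed as "1 1" for plant No.6 is read here as 11.
TABLE_10 = {
    "No.5": (["PE", "UE", "NHEJ", "UE", "UE", "nd"], [10, 9, 15, 26, 21, 0]),
    "No.6": (["UE", "nd", "PE", "UE", "UE", "nd"], [22, 0, 11, 18, 43, 0]),
    "No.7": (["PE", "UE", "UE", "nd", "UE", "nd"], [5, 12, 26, 0, 22, 0]),
    "No.8": (["UE", "nd", "UE", "nd", "UE", "nd"], [32, 0, 40, 0, 26, 0]),
    "No.9": (["PE", "nd", "IE", "UE", "UE", "nd"], [24, 0, 13, 21, 33, 0]),
    "No.10": (["PE", "UE", "UE", "nd", "UE", "nd"], [10, 19, 37, 0, 29, 0]),
    "No.11": (["UE", "nd", "UE", "nd", "PE", "UE"], [35, 0, 37, 0, 15, 11]),
    "No.12": (["UE", "nd", "UE", "nd", "IE", "NHEJ"], [34, 0, 40, 0, 14, 8]),
}

EDIT_CALLS = {"PE", "IE"}  # assumption: both count as donor-derived edits


def inconsistent_cells(table):
    """List (plant, column index) where 'nd' and a zero clone count disagree."""
    problems = []
    for plant, (status, clones) in table.items():
        for i, (call, n) in enumerate(zip(status, clones)):
            if (call == "nd") != (n == 0):
                problems.append((plant, i))
    return problems


def plants_with_edit(table):
    """Plants with at least one allele called PE or IE."""
    return [p for p, (status, _) in table.items() if EDIT_CALLS & set(status)]


problems = inconsistent_cells(TABLE_10)
edited = plants_with_edit(TABLE_10)
unedited = [p for p in TABLE_10 if p not in edited]
clone_totals = {p: sum(clones) for p, (_, clones) in TABLE_10.items()}
```

For the rows transcribed, every "nd" call matches a zero clone count, plant No.8 is the only plant without an edited allele, and no plant exceeds the stated maximum of 120 sequenced clones.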

Table 11: ZFN-mediated NHEJ-directed AHAS editing outcomes for wheat plants transformed using pDAB109350, pDAB109360 and pDAS000268

Plant | Row | A-genome allele 1 | A-genome allele 2 | B-genome allele 1 | B-genome allele 2 | D-genome allele 1 | D-genome allele 2
No.13a | No. clones¹ | 10 | 12 | 49 | 0 | 18 | 0

¹ Number of independent plasmid clones sequenced.

IE = imperfect edit; i.e., ZFN-mediated NHEJ-directed genome editing produced an unexpected outcome.

UE = unedited allele; i.e., allele had wild-type sequence.

nd = not detected; i.e., sufficient independent plasmid clones were sequenced to conclude that an alternate allele was not present and that the locus was homozygous for a single allele.

Design of zinc finger binding domains specific to region in AHAS genes encoding the P197 amino acid residue

[0315] Zinc finger proteins directed against DNA sequence of the homoeologous copies of the AHAS genes were designed as previously described. Exemplary target sequence and recognition helices are shown in Table 12 (recognition helix regions designs) and Table 13 (target sites). In Table 13, nucleotides in the target site that are contacted by the ZFP recognition helices are indicated in uppercase letters; non-contacted nucleotides are indicated in lowercase. Zinc Finger Nuclease (ZFN) target sites were designed upstream (from 2 to 510 nucleotides upstream) of the region in the AHAS gene encoding the proline 197 (P197) amino acid residue.

Table 13: Target site of AHAS zinc fingers

[0316] The AHAS zinc finger designs were incorporated into zinc finger expression vectors and verified for cleavage activity using a budding yeast system, as described previously. Of the numerous ZFNs that were designed, produced and tested to bind to the putative AHAS genomic polynucleotide target sites, the ZFNs described above were identified as having in vivo activity at high levels, and selected for further experimentation. These ZFNs were designed to bind to the three homoeologous AHAS genes and were characterized as being capable of efficiently binding and cleaving the unique AHAS genomic polynucleotide target sites in planta.

[0317] ZFN Construct Assembly: Plasmid vectors containing ZFN expression constructs verified for cleavage activity using the yeast system were designed and completed as previously described. The resulting plasmid constructs, pDAB111855 (ZFNs 34470-2A-34471), pDAB111856 (ZFNs 34472-2A-34473), and pDAB111857 (ZFNs 34474-2A-34475), were confirmed via restriction enzyme digestion and via DNA sequencing.

Preparation of DNA from ZFN constructs for transfection

[0318] Before delivery to Triticum aestivum protoplasts, plasmid DNA for each ZFN construct was prepared from cultures of E. coli using the PURE YIELD PLASMID MAXIPREP SYSTEM® (Promega Corporation, Madison, WI) or PLASMID MAXI KIT® (Qiagen, Valencia, CA) following the instructions of the suppliers.

Isolation and transfection of wheat mesophyll protoplasts

[0319] Mesophyll protoplasts from the donor wheat line cv. Bobwhite MPB26RH were prepared and transfected using polyethylene glycol (PEG)-mediated DNA delivery as previously described.

PCR assay of protoplast genomic DNA for ZFN sequence cleavage

[0320] Genomic DNA was isolated from transfected protoplasts and used for PCR assays to assess the cleavage efficiency and target site specificity of ZFNs designed to the region of the AHAS gene encoding P197, as previously described. Five sets of PCR primers which contained a phosphorothioate linkage as indicated by the asterisk [*] were used to amplify the ZFN target site loci (Table 14). Each primer set was designed according to criteria previously described.

Table 14: Primer sequences used to assess AHAS ZFN cleavage efficacy and target site specificity.

Data analysis for detecting NHEJ at target ZFN sites

[0321] Following generation of Illumina short read sequence data for sample libraries prepared for transfected mesophyll protoplasts, bioinformatics analysis (as previously described) was performed to identify deleted nucleotides at the target ZFN sites. Such deletions are known to be indicators of in planta ZFN activity that result from non-homologous end joining (NHEJ) DNA repair.

[0322] Two approaches were used to assess the cleavage efficiency and specificity of the ZFNs tested. Cleavage efficiency was expressed (in parts per million reads) as the proportion of sub-genome assigned sequences that contained a NHEJ deletion at the ZFN target site (Table 15). Rank ordering of the ZFNs by their observed cleavage efficiency was used to identify ZFNs with the best cleavage activity for the target region of the AHAS genes in a sub-genome-specific manner. All of the ZFNs tested showed NHEJ deletion size distributions consistent with that expected for in planta ZFN activity. Cleavage specificity was expressed as the ratio of cleavage efficiencies observed across the three sub-genomes.
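The two measures described above can be sketched as follows. This is an illustration under stated assumptions rather than the analysis pipeline actually used: the read counts are hypothetical, and expressing specificity relative to the least-cleaved sub-genome is one possible way of forming the ratio, chosen here for illustration.

```python
def cleavage_ppm(nhej_reads, assigned_reads):
    """NHEJ-deletion reads per million sub-genome-assigned reads."""
    if assigned_reads <= 0:
        raise ValueError("assigned_reads must be positive")
    return 1_000_000 * nhej_reads / assigned_reads


def specificity_ratio(ppm_by_subgenome):
    """Express each sub-genome's efficiency relative to the lowest one."""
    lowest = min(ppm_by_subgenome.values())
    if lowest == 0:
        raise ValueError("ratio undefined when a sub-genome shows no cleavage")
    return {sg: ppm / lowest for sg, ppm in ppm_by_subgenome.items()}


# Hypothetical read counts per sub-genome: (reads with NHEJ deletion, assigned reads).
counts = {"A": (500, 1_000_000), "B": (250, 1_000_000), "D": (1000, 2_000_000)}
ppm = {sg: cleavage_ppm(nhej, total) for sg, (nhej, total) in counts.items()}
ratios = specificity_ratio(ppm)

# Rank sub-genomes (or, analogously, ZFNs) by observed efficiency, highest first.
ranking = sorted(ppm, key=ppm.get, reverse=True)
```

Normalizing by the number of assigned reads keeps sub-genomes with different sequencing depth comparable, which is why the D sub-genome in this hypothetical example scores the same as A despite having twice as many deletion reads.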

Table 15: ZFN cleavage efficacy (expressed as number of NHEJ events per

[0323] From these results, the ZFNs encoded on plasmids pDAB111855 (34470-2A-34471), pDAB111856 (34472-2A-34473) and pDAB111857 (34474-2A-34475) were selected for in planta targeting in subsequent experiments, given their characteristics of significant genomic DNA cleavage activity in each of the three wheat sub-genomes.

Generation of molecular evidence for ZFN-mediated, exogenous marker-free sequential transgene stacking at an endogenous AHAS locus using transient assays

[0324] The generation of molecular evidence using transient assays for ZFN-mediated, sequential exogenous marker-free transgene stacking at an endogenous AHAS locus within the genome of Triticum aestivum cells via homology directed DNA repair is achieved as follows.

[0325] The AHAS (S653N) edited wheat plants, which were produced via transformation with donor pDAS000267 and the Zinc Finger Nuclease encoded on plasmid pDAB109350, demonstrate the first step for sequential, exogenous marker-free transgene stacking at an endogenous AHAS locus in the genome of wheat. These edited plants are used to generate explant material (e.g., protoplasts or scutella of immature zygotic embryos) for transfection using the previously described methods. The explant material is subsequently co-transfected with a donor DNA molecule and a plasmid encoding a ZFN (e.g., pDAB111855, pDAB111856 or pDAB111857) that is designed to target a Zinc Finger binding site located in the AHAS genes upstream of the region encoding the P197 amino acid residue. The ZFN cleaves an AHAS locus and the donor molecule is integrated within the genome of Triticum aestivum cells via homology directed repair. As a result of NHEJ-mediated donor molecule integration, the AHAS(P197S) mutation conferring tolerance to sulfonylurea class herbicides is introduced into the endogenous AHAS sequence and simultaneously, the AHAS(S653N) mutation introduced in the first round of transgene stacking is removed. Consequently, the expression of the endogenous AHAS gene is changed from conferring tolerance to imidazolinones and susceptibility to sulfonylureas (the phenotype of correctly targeted wheat cells in the first round of transgene stacking) to conferring susceptibility for imidazolinones and tolerance for sulfonylureas, thus allowing for the regeneration of correctly targeted cells using a sulfonylurea selection agent. Molecular evidence for the integration of the donor DNA and generation of correctly targeted wheat cells is confirmed using the previously described methods.

[0326] It is appreciated by those skilled in the art that co-transformation of wheat cells with a donor DNA molecule that contains one or more transgenes and a plasmid encoding a Zinc Finger Nuclease enables either parallel (simultaneous) or sequential transgene integration (transgene stacking) in plant genomes at precisely the same genomic location, including simultaneous editing of multiple alleles across multiple genomes in polyploid plant species.

Development of a transformation system for sequential, exogenous marker-free transgene stacking at the endogenous AHAS loci in wheat

[0327] The endogenous AHAS gene in wheat was selected as a model locus to develop a ZFN-mediated, exogenous marker-free transformation system for generating plants with one or more transgenes precisely positioned at the same genomic location. The transformation system enables parallel (simultaneous integration of one or more transgenes) or sequential stacking (consecutive integration of one or more transgenes) at precisely the same genomic location, including simultaneous parallel or sequential stacking at multiple alleles across multiple sub-genomes, by exploiting known mutations in the AHAS gene that confer tolerance to Group B herbicides. ZFN-mediated integration of a donor DNA into the wild-type (herbicide susceptible) AHAS locus is used to introduce transgene(s) and a mutation to the endogenous AHAS gene that confers tolerance to imidazolinones, thus allowing the regeneration of correctly targeted plants using an imidazolinone selection agent. Stacking of a second transgene(s) at the AHAS locus is achieved by integration of a donor DNA that introduces one or more additional transgenes and confers susceptibility to imidazolinones but tolerance to sulfonylureas, thus allowing the regeneration of correctly targeted plants using a sulfonylurea selection agent. Stacking of a third transgene can be achieved by integration of a donor molecule that introduces further transgene(s) and confers susceptibility to sulfonylurea and tolerance to imidazolinones, thus allowing the regeneration of correctly targeted plants using an imidazolinone selection agent. As such, continued rounds of sequential transgene stacking are possible by the use of donor DNAs that introduce transgene(s) and mutations at the endogenous AHAS genes for differential cycling between imidazolinone and sulfonylurea selection agents. The transgenes can be integrated within the AHAS gene and stacked via an NHEJ pathway. The NHEJ repair and recombination pathway can be determined by the design of the donor transgene. In an embodiment, transgenes that are integrated and stacked within the AHAS gene would be designed to contain single or double cut ZFN sites that flank the payload (e.g., AHAS mutation and gene of interest). Accordingly, such a design would utilize an NHEJ pathway for the integration and stacking of the donor polynucleotide within the chromosome.
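The alternation between selection agents across stacking rounds can be summarized as a simple rule. The sketch below is only a restatement of the cycling scheme described above in code form; the function names and the convention of numbering rounds from 1 are assumptions introduced for illustration.

```python
IMIDAZOLINONE = "imidazolinone"
SULFONYLUREA = "sulfonylurea"


def selection_agent_for_round(round_number):
    """Selection agent for a stacking round (rounds numbered from 1).

    Round 1 introduces imidazolinone tolerance; each later round swaps
    which herbicide class the AHAS locus tolerates.
    """
    if round_number < 1:
        raise ValueError("round_number must be 1 or greater")
    return IMIDAZOLINONE if round_number % 2 == 1 else SULFONYLUREA


def locus_phenotype_after_round(round_number):
    """Herbicide classes tolerated and not tolerated after a given round."""
    tolerant = selection_agent_for_round(round_number)
    susceptible = SULFONYLUREA if tolerant == IMIDAZOLINONE else IMIDAZOLINONE
    return {"tolerant": tolerant, "susceptible": susceptible}


schedule = [selection_agent_for_round(r) for r in range(1, 5)]
after_second = locus_phenotype_after_round(2)
```

Because each correctly targeted round removes the tolerance conferred by the previous round, cells that retain the earlier configuration are not recovered on the new selection agent, and no exogenous selectable marker is needed.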

Generation of low-copy, randomly integrated T-DNA wheat plants with AHAS(P197S) expression constructs

[0328] A binary vector pDAS000164 (SEQ ID NO:326, Figure 16) containing the AHAS(P197S) expression and PAT selection cassettes was designed and assembled using skills and techniques commonly known in the art. The AHAS (P197S) expression cassette consisted of the promoter, 5' untranslated region and intron from the Ubiquitin (Ubi) gene from Zea mays (Toki et al, (1992) Plant Physiology, 100;1503-07) followed by the coding sequence (1935 bp) of the AHAS gene from T. aestivum cv. Bobwhite MPB26RH with nucleotide 511 mutated from C to T in order to induce an amino acid change from proline (P) to serine (S). The AHAS expression cassette included the 3' untranslated region (UTR) of the nopaline synthase gene (nos) from A. tumefaciens pTi15955 (Fraley et al, (1983) Proceedings of the National Academy of Sciences U.S.A. 80(15): 4803-4807). The selection cassette comprised the promoter, 5' untranslated region and intron from the actin 1 (Act1) gene from Oryza sativa (McElroy et al., (1990) The Plant Cell 2(2):163-171) followed by a synthetic, plant-optimized version of phosphinothricin acetyl transferase (PAT) gene, isolated from Streptomyces viridochromogenes, which encodes a protein that confers resistance to inhibitors of glutamine synthetase comprising phosphinothricin, glufosinate, and bialaphos (Wohlleben et al, (1988) Gene, 70(1): 25-37). This cassette was terminated with the 3' UTR from the 35S gene of cauliflower mosaic virus (CaMV) (Chenault et al, (1993) Plant Physiology 101 (4): 1395-1396).
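The statement that a single C to T change at nucleotide 511 converts proline to serine can be checked with a short calculation. The sketch below is illustrative only: it assumes nucleotide positions are counted from 1 at the first base of the 1935-bp coding sequence, and the helper name is an assumption. Note that the codon index it returns is counted along that coding sequence, which is a different numbering from the residue label P197.

```python
def codon_position(nucleotide):
    """Map a 1-based coding-sequence position to (codon index, position in codon)."""
    if nucleotide < 1:
        raise ValueError("nucleotide positions are 1-based")
    return (nucleotide - 1) // 3 + 1, (nucleotide - 1) % 3 + 1


# Standard genetic code: all four CCN codons encode proline, and
# TCN, AGT and AGC encode serine.
PROLINE_CODONS = ["CCT", "CCC", "CCA", "CCG"]
SERINE_CODONS = {"TCT", "TCC", "TCA", "TCG", "AGT", "AGC"}

codon_index, position_in_codon = codon_position(511)

# Replace the first base (C) of every proline codon with T.
mutated = ["T" + codon[1:] for codon in PROLINE_CODONS]
all_serine = all(codon in SERINE_CODONS for codon in mutated)
```

Nucleotide 511 falls on the first base of its codon, and changing the first C of any proline codon to T gives a serine codon, so the single substitution is sufficient whichever proline codon is present.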

[0329] The selection cassette was synthesized by a commercial gene synthesis vendor (e.g., GeneArt, Life Technologies, etc.) and cloned into a GATEWAY®-enabled binary vector with the RfA Gateway cassette located between the Ubiquitin (Ubi) gene from Zea mays and the 3' untranslated region (UTR) comprising the transcriptional terminator and polyadenylation site of the nopaline synthase gene (nos) from A. tumefaciens pTi15955. The AHAS (P197S) coding sequence was amplified with flanking attB sites and sub-cloned into pDONR221. The resulting ENTRY clone was used in an LR CLONASE II® (Invitrogen, Life Technologies) reaction with the Gateway-enabled binary vector encoding the phosphinothricin acetyl transferase (PAT) expression cassette. Colonies of all assembled plasmids were initially screened by restriction digestion of miniprep DNA. Restriction endonucleases were obtained from New England BioLabs (NEB; Ipswich, MA) and Promega (Promega Corporation, WI). Plasmid preparations were performed using the QIAPREP SPIN MINIPREP KIT® (Qiagen, Hilden) or the PURE YIELD PLASMID MAXIPREP SYSTEM® (Promega Corporation, WI) following the instructions of the suppliers. Plasmid DNA of selected clones was sequenced using ABI Sanger Sequencing and BIG DYE TERMINATOR V3.1® cycle sequencing protocol (Applied Biosystems, Life Technologies). Sequence data were assembled and analyzed using the SEQUENCHER™ software (Gene Codes Corporation, Ann Arbor, MI).

[0330] The resulting binary expression clone pDAS000164 was transformed into Agrobacterium tumefaciens strain EHA105. Transgenic wheat plants with randomly integrated T-DNA were generated by Agrobacterium-mediated transformation using the donor wheat line cv. Bobwhite MPB26RH, following a protocol similar to Wu et al. (2008) Transgenic Research 17:425-436. Putative T₀ transgenic events expressing the AHAS(P197S) expression constructs were selected for phosphinothricin (PPT) tolerance, the phenotype conferred by the PAT selectable marker, and transferred to soil. The T₀ plants were grown under glasshouse containment conditions and T₁ seed was produced.

[0331] Genomic DNA from each T₀ plant was extracted from leaf tissue, as previously described, and tested for the presence of Agrobacterium tumefaciens and for the number of integrated copies of the T-DNA encoding AHAS(P197S). The presence of A. tumefaciens was tested using a duplex hydrolysis probe qPCR assay (analogous to TAQMAN™) to amplify the endogenous ubiquitin gene (SEQ ID NO:327, SEQ ID NO:328, and SEQ ID NO:329 for forward and reverse primers and probe sequence, respectively) from the wheat genome, and virC from pTiBo542 (SEQ ID NO:330, SEQ ID NO:331, and SEQ ID NO:332 for forward and reverse primers and probe sequence, respectively). The number of integrated T-DNA copies was estimated using a duplex hydrolysis probe qPCR assay, as previously described, based on the puroindoline-b (Pinb) gene from the D genome of hexaploid wheat and a region of the Actin (Act1) promoter present on pDAS000164. Overall, 35 independent T₀ events with fewer than three randomly integrated copies of T-DNA were generated.

Optimization of chemical selection conditions based on sulfometuron methyl

[0332] A series of experiments was performed to determine optimal selection conditions for regenerating wheat plants expressing the AHAS(P197S) mutation conferring tolerance to sulfonylurea class herbicides. These experiments were based on testing the basal tolerance of the wild-type donor wheat line cv. Bobwhite MPB26RH (P197/P197 genotype, which confers susceptibility to sulfonylureas) at the callus induction, plant regeneration and rooting stages of an established wheat transformation system. Similar experiments were performed to determine the basal tolerance of transgenic cv. Bobwhite MPB26RH events that had randomly integrated T-DNA expressing the AHAS(P197S) mutation, which confers tolerance to sulfonylurea selection agents.

[0333] The basal tolerance of the wild-type donor wheat line to sulfometuron methyl at the callus induction stage was determined as follows: Scutella of immature zygotic embryos were isolated, as previously described, and placed in 10 cm PETRI™ dishes containing CIM medium supplemented with 0, 100, 500, 1000, 1500 and 2000 nM sulfometuron methyl, respectively. Twenty scutella were placed in each PETRI™ dish. A total of 60 scutella were tested at each sulfometuron methyl concentration. After incubation at 24°C in the dark for 4 weeks, the amount of somatic embryogenic callus (SEC) formation at each sulfometuron methyl concentration was recorded. The results showed that SEC formation for cv. Bobwhite MPB26RH was reduced by about 70% at 100 nM sulfometuron methyl, compared to untreated samples.
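The layout of this dose-response experiment, and the way a percent reduction relative to the untreated control is obtained, can be sketched as follows. The concentrations and the numbers of scutella per dish and per concentration are taken from the paragraph above; the callus counts are hypothetical and are used only to show the arithmetic, since the raw counts are not given.

```python
CONCENTRATIONS_NM = [0, 100, 500, 1000, 1500, 2000]
SCUTELLA_PER_DISH = 20
SCUTELLA_PER_CONCENTRATION = 60

# Dishes needed per concentration and for the whole experiment.
dishes_per_concentration = SCUTELLA_PER_CONCENTRATION // SCUTELLA_PER_DISH
total_dishes = dishes_per_concentration * len(CONCENTRATIONS_NM)
total_scutella = SCUTELLA_PER_CONCENTRATION * len(CONCENTRATIONS_NM)


def percent_reduction(control_count, treated_count):
    """Reduction of a response relative to the untreated control, in percent."""
    if control_count <= 0:
        raise ValueError("control_count must be positive")
    return 100.0 * (control_count - treated_count) / control_count


# Hypothetical counts of scutella forming SEC (out of 60) at 0 nM and 100 nM.
hypothetical_control = 50
hypothetical_treated = 15
reduction = percent_reduction(hypothetical_control, hypothetical_treated)
```

Expressing each treatment relative to the 0 nM dishes from the same experiment corrects for the fact that not every untreated scutellum forms embryogenic callus.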

[0334] The basal tolerance of the wild-type donor wheat line to sulfometuron methyl at the plant regeneration stage was determined as follows: Scutella of immature zygotic embryos from the donor wheat line were isolated and placed in 10 cm PETRI™ dishes containing CIM medium. Somatic embryogenic callus was allowed to form by incubating at 24°C in the dark for 4 weeks. The SEC was transferred to 10 cm PETRI™ dishes containing DRM medium supplemented with 0, 100, 500, 1000, 1500, 2000, 2500 and 3000 nM sulfometuron methyl, respectively. Twenty SEC were placed in each PETRI™ dish. A total of 60 SEC were tested for basal tolerance response at each sulfometuron methyl concentration. After incubation for 2 weeks at 24°C under a 16/8 (light/dark) hour photoperiod in a growth room, the regeneration response was recorded. The results showed that plant regeneration was reduced by about 80% at 2000 nM sulfometuron methyl, compared to untreated samples.

[0335] The basal tolerance of the wild-type donor wheat line to sulfometuron methyl at the plant rooting stage was determined as follows: Scutella of immature zygotic embryos were isolated and placed in 10 cm PETRI™ dishes containing CIM medium. Somatic embryogenic callus was allowed to form by incubating at 24°C in the dark for 4 weeks. The SEC was transferred to 10 cm PETRI™ dishes containing DRM medium and incubated for 2 weeks at 24°C under a 16/8 (light/dark) hour photoperiod to allow plant regeneration to take place. Regenerated plants were transferred to 10 cm PETRI™ dishes containing RM medium supplemented with 0, 100, 200, 250, 300, 400, 500, 1000 and 2000 nM sulfometuron methyl, respectively. Ten regenerated plants were placed in each PETRI™ dish. A total of 30 regenerated plants were tested for basal tolerance response at each sulfometuron methyl concentration. After incubation for 3 weeks at 24°C under a 16/8 (light/dark) hour photoperiod in a growth room, the root formation response was recorded. The results showed that root formation was severely inhibited at concentrations of sulfometuron methyl higher than 400 nM, compared to untreated samples.

[0336] The basal tolerance of transgenic wheat events with randomly integrated, low-copy (< 3) T-DNA expressing the AHAS(P197S) mutation to sulfometuron methyl at the plant rooting stage was determined as follows: Four independent transgenic events were randomly selected and multiplied in vitro by sub-culturing on multiplication medium. Following multiplication, plants for each event were transferred to 10 cm PETRI™ dishes containing RM medium supplemented with 0, 400, 450, 500, 550 and 600 nM sulfometuron methyl, respectively. Four plants (one from each of the four events) were placed in each PETRI™ dish. A total of 3 plants per event were tested for basal tolerance at each sulfometuron methyl concentration. After incubation for 2 weeks at 24°C under a 16/8 (light/dark) hour photoperiod in a growth room, the root formation response was recorded. The results showed that root formation was not restricted, compared to untreated controls, at any of the concentrations tested, indicating that the AHAS(P197S) mutation conferred high tolerance to sulfometuron methyl.

Design and synthesis of donor DNA for first sequential transgene stacking at an endogenous AHAS locus using NHEJ-directed DNA repair

[0337] The donor DNA for the first round of transgene stacking is designed to promote precise donor integration at an endogenous AHAS locus via ZFN-mediated, NHEJ-directed repair. The design is based on the integration of a double stranded donor molecule at the position of the double strand DNA break created by cleavage of a homoeologous copy of the endogenous AHAS gene by ZFNs 29732 and 29730 (encoded on plasmid pDAB109350). The donor molecule (pDAS000433; SEQ ID NO:333, Figure 17) comprises several portions of polynucleotide sequences. The 5' end contains sequence nearly identical to the endogenous AHAS gene encoded in the D-genome, starting from the target ZFN cleavage site and finishing at the AHAS stop codon. Seven deliberate mutations are introduced into this sequence: two mutations encode the S653N mutation, and five codon-optimized, synonymous mutations are positioned across the binding site of ZFN 29732 to prevent re-cleavage of the integrated donor. Following the stop codon is 316 bp of non-coding sequence corresponding to the conserved 3' untranslated region (3'UTR) across the AHAS homoeologs. The 3'UTR sequence is followed by Zinc Finger binding sites for ZFNs 34480 and 34481 (encoded on plasmid pDAB111860) and ZFNs 34482 and 34483 (encoded on plasmid pDAB111861). These Zinc Finger binding sites allow for self-excision of donor-derived AHAS (coding and 3'UTR) sequence integrated at the endogenous locus during the next round of transgene stacking. The self-excision Zinc Finger binding sites are followed by several additional Zinc Finger binding sites (each of which is separated by 100 bp of random sequence) that flank two unique restriction endonuclease cleavage sites and which enable the insertion of a transgene expression cassette (e.g., the PAT expression cassette, as described previously) into the donor molecule. The additional Zinc Finger binding sites enable future excision of transgenes integrated at an AHAS locus by sequential marker-free transgene stacking, or continued sequential transgene stacking at the same genomic location using an alternate stacking method.

[0338] The donor cassette is synthesized by a commercial gene service vendor (e.g., GeneArt, Life Sciences, etc.) with a short stretch of additional flanking sequence at the 5' and 3' ends to enable generation of a donor molecule with protruding 5' and 3' ends that are compatible with the ligation overhangs generated by ZFNs 29732 and 29730 (encoded on plasmid pDAB109350) upon cleavage of an endogenous AHAS locus. The donor molecule with protruding 5' and 3' ends is generated by digesting plasmid DNA containing the donor molecule with the restriction endonuclease BbsI using standard methods known to the person having skill in the art.

Design and synthesis of donor DNA for second sequential transgene stack at an endogenous AHAS locus using NHEJ-directed DNA repair

[0339] The donor DNA for the second round of transgene stacking is designed to promote precise donor integration at the same AHAS locus targeted in the first transgene stack via ZFN-mediated, NHEJ-directed repair. The design is based on the integration of a double stranded donor molecule at the double strand DNA break created by cleavage of the AHAS gene copy containing the first stacked transgene by ZFNs 34480 and 34481 (encoded on plasmid pDAB111860) or ZFNs 34482 and 34483 (encoded on plasmid pDAB111861). The donor molecule (pDAS000434; SEQ ID NO:334, Figure 18) comprises several portions of polynucleotide sequences. The 5' end contains sequence nearly identical to the endogenous AHAS gene encoded in the D-genome, starting from the target ZFN cleavage site and finishing at the AHAS stop codon. Several deliberate mutations are introduced into this sequence: mutations encoding the P197S mutation, and codon-optimized, synonymous mutations positioned across the binding site of ZFNs 34481 and 34483 to prevent re-cleavage of the integrated donor. Following the stop codon is 316 bp of non-coding sequence corresponding to the conserved 3' untranslated region (3'UTR) in the AHAS homoeologs. The 3'UTR sequence is followed by Zinc Finger binding sites for ZFNs 34474 and 34475 (encoded on plasmid pDAB111857) and ZFNs 34476 and 34477 (encoded on plasmid pDAB111858). These Zinc Finger binding sites allow for self-excision of donor-derived AHAS (coding and 3'UTR) sequence integrated at an endogenous locus in the next round of transgene stacking. The self-excision Zinc Finger binding sites are followed by several additional Zinc Finger binding sites (each of which is separated by 100 bp of random sequence) that flank unique restriction endonuclease cleavage sites and which enable insertion of a transgene expression cassette (e.g., the DGT-28 expression cassette, as described in Patent Application Number 13757536). The additional Zinc Finger binding sites enable future excision of transgenes which can be integrated at an AHAS locus by sequential marker-free transgene stacking, or continued sequential transgene stacking at the same genomic location using an alternate stacking method. The donor cassette is synthesized by a commercial gene service vendor (e.g., GeneArt, Life Sciences) with a short stretch of additional flanking sequence at the 5' and 3' ends to enable generation of a donor molecule with protruding 5' and 3' ends that are compatible with the ligation overhangs generated by ZFNs 34474 and 34475 (encoded on plasmid pDAB111857) or ZFNs 34476 and 34477 (encoded on plasmid pDAB111858), upon cleavage of an endogenous AHAS locus. The donor molecule with protruding 5' and 3' ends is generated by digesting plasmid DNA containing the donor molecule with the restriction endonuclease BbsI using standard methods known to one skilled in the art.

Transformation system for exogenous marker-free, sequential transgene stacking at an endogenous AHAS locus in wheat using NHEJ-directed DNA repair

[0340] Transgenic wheat events with multiple transgenes stacked at the same endogenous AHAS locus are produced by exogenous marker-free, sequential transgene stacking via transformation with donor pDAS000433 and ZFNs 29732 and 29730 (encoded on plasmid pDAB109350). Precise ZFN-mediated, NHEJ-directed donor integration introduces the first transgene and the S653N mutation conferring tolerance to imidazolinones at an AHAS locus, thus allowing for the regeneration of correctly targeted plants using IMAZAMOX® as a selection agent, as previously described. Figure 19a depicts the integration. Subsequent transformation of wheat cells, derived from first transgene stacked events, with donor pDAS000434 and ZFNs 34480 and 34481 (encoded on plasmid pDAB111860) results in the replacement, with the donor molecule, of the endogenous chromatin located between the ZFN binding sites positioned upstream of P197 and at the self-excision site integrated during the first transgene stack. This results in integration of the second transgene and a P197S mutation conferring tolerance to sulfonylurea, thus allowing for the regeneration of correctly targeted plants using sulfometuron methyl as a selection agent. At the same time, integration of the second donor removes the S653N mutation, thus restoring susceptibility to imidazolinones (Figure 19b). One skilled in the art will appreciate that stacking of a third transgene can be achieved by transformation with appropriate zinc finger nucleases and a donor that contains an additional transgene and confers susceptibility to sulfonylurea and tolerance to imidazolinones, thus allowing the regeneration of correctly targeted plants using IMAZAMOX® as a selection agent. As such, continued rounds of sequential transgene stacking are possible via transformation with donors that introduce transgenes and mutations in the endogenous AHAS genes for differential cycling between imidazolinone and sulfonylurea selection agents.

Design and synthesis of donor DNA for first sequential transgene stacking at an endogenous AHAS locus using HDR-directed DNA repair

[0341] The donor DNA for the first round of transgene stacking is designed to promote precise donor integration at an endogenous AHAS locus via ZFN-mediated repair. The design is based on the integration of a double stranded donor molecule at the position of the double strand DNA break created by cleavage of a homoeologous copy of the endogenous AHAS gene by ZFNs 29732 and 29730 (encoded on plasmid pDAB109350). The donor molecule (pDAS000435; SEQ ID NO: 335, Figure 21) is identical in sequence to pDAS000433 (SEQ ID NO:333).

Transformation system for exogenous marker-free, sequential transgene stacking at an endogenous AHAS locus in wheat using HDR-directed DNA repair

[0342] Transgenic wheat events with multiple transgenes stacked at the same endogenous AHAS locus are produced by exogenous marker-free, sequential transgene stacking via transformation with donor pDAS000435 and ZFNs 29732 and 29730 (encoded on plasmid pDAB109350). Precise ZFN-mediated, HDR-directed donor integration introduces the first transgene and the S653N mutation conferring tolerance to imidazolinones at an AHAS locus, thus allowing for the regeneration of correctly targeted plants using IMAZAMOX® as a selection agent, as previously described. Figure 20a depicts the integration. Subsequent transformation of wheat cells, derived from first transgene stacked events, with donor pDAS000436 and ZFNs 34480 and 34481 (encoded on plasmid pDAB111860) results in the replacement, with the donor molecule, of the endogenous chromatin located between the ZFN binding sites positioned upstream of P197 and at the self-excision site integrated during the first transgene stack. This results in integration of the second transgene and a P197S mutation conferring tolerance to sulfonylurea, thus allowing for the regeneration of correctly targeted plants using sulfometuron methyl as a selection agent. At the same time, integration of the second donor removes the S653N mutation, thus restoring susceptibility to imidazolinones (Figure 20b). As will be obvious to one skilled in the art, stacking of a third transgene can be achieved by transformation with appropriate zinc finger nucleases and a donor that contains an additional transgene and confers susceptibility to sulfonylurea and tolerance to imidazolinones, thus allowing the regeneration of correctly targeted plants using IMAZAMOX® as a selection agent. As such, continued rounds of sequential transgene stacking are possible via transformation with donors that introduce transgenes and mutations in the endogenous AHAS genes for differential cycling between imidazolinone and sulfonylurea selection agents.
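The alternation between selection agents described above can be illustrated with a short sketch. The function and dictionary names below are hypothetical and serve only to illustrate the cycling logic; the mutations, tolerances and selection agents are those named in the text.

```python
from itertools import cycle

# Alternating AHAS mutations and matching selection agents, as named in the text.
# The data layout and function name are illustrative assumptions.
SELECTION_CYCLE = [
    {"mutation": "S653N", "tolerance": "imidazolinone", "agent": "IMAZAMOX"},
    {"mutation": "P197S", "tolerance": "sulfonylurea", "agent": "sulfometuron methyl"},
]


def stacking_plan(n_rounds):
    """Return the AHAS mutation and selection agent used in each stacking round."""
    plan = []
    for round_no, step in zip(range(1, n_rounds + 1), cycle(SELECTION_CYCLE)):
        plan.append({"round": round_no, **step})
    return plan


if __name__ == "__main__":
    for step in stacking_plan(3):
        print(step["round"], step["mutation"], step["agent"])
```

Each round introduces one mutation and removes the mutation of the previous round, so plants recovered on the current selection agent are expected to carry the most recent donor.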

Artificial crossing and molecular analysis to recover transgenic plants with specific combinations of precise genome modifications

[0343] The Triticum aestivum events which are produced via transformation with donor DNA and zinc finger nuclease constructs result in the integration of donor molecule sequence at one or more copies of the target endogenous locus. As shown previously, ZFN-mediated genome modification can include simultaneous editing of multiple alleles across multiple sub-genomes. Artificial crossing of transformation events can be subsequently used to select for specific combinations of precise genome modifications. For example, artificial crossing of transformation events that have precisely modified AHAS genes with the S653N mutation can be used to produce wheat plants that have the S653N mutation on a specific sub-genome, on multiple sub-genomes, or on all three sub-genomes. Subsequent artificial crossing of transformation events facilitates the generation of plants that have specific combinations of precise genome modifications. One skilled in the art can deploy molecular assays, such as those previously described, to track the inheritance of specific genome modifications during artificial crossing in subsequent generations.
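By way of illustration only, the possible sub-genome combinations carrying a given modification can be enumerated as follows. The sketch assumes the three sub-genomes of hexaploid wheat are labelled A, B and D; the function name is hypothetical.

```python
from itertools import combinations

# Sub-genomes of hexaploid wheat (Triticum aestivum).
SUBGENOMES = ("A", "B", "D")


def modified_subgenome_combinations(subgenomes=SUBGENOMES):
    """List every non-empty set of sub-genomes that could carry a modification."""
    combos = []
    for size in range(1, len(subgenomes) + 1):
        combos.extend(combinations(subgenomes, size))
    return combos


if __name__ == "__main__":
    for combo in modified_subgenome_combinations():
        print("S653N on sub-genome(s):", ", ".join(combo))
```

For three sub-genomes this yields seven combinations, ranging from a single modified sub-genome to all three.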

Example 7: Targeted integration into and disruption of Brassica napus omega-3 fatty acid desaturase (Fad3)

Selection of zinc finger binding domains specific to Fad3C and Fad3 A

[0344] The transcribed regions for homoeologous Fad3 genes were identified and characterized, and zinc finger nucleases were designed and constructed to bind and cleave these sites for NHEJ-mediated targeting of a donor sequence as described herein. See U.S. Provisional Patent Filing No. 61/697,854, herein incorporated by reference. Zinc finger proteins (ZFPs) directed against DNA sequences from homoeologues of Fad3 sequences were designed and tested as previously described in U.S. Provisional Patent Filing No. 61/697,854. From the ZFNs showing on-target activity, two zinc finger proteins were selected that cut the Fad3 target at high efficiency: ZFP 28051-2A-28052 recognizes SEQ ID NO:336 (5'-GCCCAAGGAACCCTTTTCTGGGCCATCTTCGTACTCGGCCACGACTGGTAATTTAAT-3') and was previously shown to specifically bind and cleave the Fad3C genomic locus. Likewise, zinc finger protein 28053-2A-28054 recognizes SEQ ID NO:337 (5'-AGCGAGAGAAAGCTTATTGCAACTTCAACTACTTGCTGGTCGATCGTGTTGGCCACTC-3') and was previously shown to specifically bind and cleave the Fad3A and Fad3C genomic loci. Nucleotides in the target sites that are contacted by the ZFP recognition helices are shown in Table 16.

Table 16: Zinc Finger Protein Binding Sites specific to Fad3C (28051-2A-28052) or Fad3A and Fad3C (28053-2A-28054). Nucleotides in the target site that are contacted by the ZFP recognition helices are indicated in uppercase letters; non-contact nucleotides are indicated in lowercase. Nucleotides in copies of Fad3 that differ from Fad3C are identified by underlining.
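As a non-limiting illustration, a recognition sequence such as SEQ ID NO:336 can be located on either strand of a candidate genomic sequence with a simple string search. The function names and the flanking bases in the usage example are hypothetical; only the SEQ ID NO:336 sequence is taken from the text.

```python
# SEQ ID NO:336 as recited in the text (ZFP 28051-2A-28052 recognition sequence).
SEQ_ID_NO_336 = "GCCCAAGGAACCCTTTTCTGGGCCATCTTCGTACTCGGCCACGACTGGTAATTTAAT"

_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))


def find_site(genome, site):
    """Return sorted (start, strand) pairs for every occurrence of site in genome."""
    genome = genome.upper()
    hits = []
    for strand, query in (("+", site.upper()), ("-", reverse_complement(site))):
        start = genome.find(query)
        while start != -1:
            hits.append((start, strand))
            start = genome.find(query, start + 1)
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical flanking bases around the real recognition sequence.
    toy_locus = "TTTT" + SEQ_ID_NO_336 + "GGGG"
    print(find_site(toy_locus, SEQ_ID_NO_336))
```

An exact-match search of this kind does not model zinc finger binding specificity; it only reports where the recited sequence occurs.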

Design and construction of expression vectors encoding zinc finger nucleases specific to Fad3C and Fad3A

[0345] The Fad3 zinc finger designs were incorporated into zinc finger expression vectors encoding a protein having at least one finger with a CCHC structure (U.S. Patent Publication No. 2008/0182332). In particular, the last finger in each protein had a CCHC backbone for the recognition helix. The non-canonical, zinc finger-encoding sequences were fused to the nuclease domain of the type IIS restriction enzyme FokI (amino acids 384-579 of the sequence of Wah et al., (1998) Proc. Natl. Acad. Sci. USA 95:10564-10569) via a four amino acid ZC linker and a sop2 nuclear localization signal. The self-hydrolyzing 2A encoding nucleotide sequence from Thosea asigna virus (Szymczak et al., 2004) was added between the two Zinc Finger Nuclease fusion proteins. Expression of the ZFNs was driven by the strong constitutive promoter and 5' untranslated region (UTR) from Cassava Vein Mosaic Virus (Verdaguer et al., Plant Molecular Biology 1996, 31(6); 1129-1139) and flanked by the 3' UTR (including the transcriptional terminator and polyadenylation site) from open reading frame 23 (ORF23) of Agrobacterium tumefaciens pTi15955 (Barker et al., Plant Molecular Biology 1983, 2(6); 335-50).

[0346] The vectors were assembled using the IN-FUSION™ Advantage Technology (Clontech, Mountain View, CA). Restriction endonucleases were obtained from New England BioLabs (NEB; Ipswich, MA) and T4 DNA Ligase (Invitrogen) was used for DNA ligation. Plasmid preparations were performed using the NUCLEOSPIN® Plasmid Kit (Macherey-Nagel Inc., Bethlehem, PA) or the PLASMID MIDI KIT™ (Qiagen) following the instructions of the suppliers. DNA fragments were isolated using the QIAQUICK GEL EXTRACTION KIT™ (Qiagen) after agarose Tris-acetate gel electrophoresis. Colonies of assembled plasmids were initially screened by restriction digestion of miniprep DNA. Plasmid DNA of selected clones was sequenced by a commercial sequencing vendor (Eurofins MWG Operon, Huntsville, AL). Sequence data were assembled and analyzed using the SEQUENCHER™ software (Gene Codes, Ann Arbor, MI). The resulting plasmid constructs, pDAB107827 (ZFN 28051-2A-28052, Figure 22, SEQ ID NO:350) and pDAB107828 (ZFN 28053-2A-28054, Figure 23, SEQ ID NO:351), were confirmed via restriction enzyme digestion and via DNA sequencing.

Design and construction of "donor" vectors for NHEJ-directed DNA repair

[0347] Two strategies for integration of DNA into Fad3 were undertaken: gene splicing, where an expression cassette was inserted into a single ZFN-induced double-stranded break, and gene editing, where a portion of the gene was removed by the use of two ZFN-induced double-stranded breaks and an expression cassette was inserted to repair the gap.

[0348] For each integration method, gene splicing or gene editing, two vectors were constructed. The first encoded a turboGFP (tGFP) gene expression cassette and the second encoded a gene expression cassette to confer resistance to the antibiotic hygromycin. The tGFP expression cassette consisted of the promoter, 5' untranslated region and intron from the Arabidopsis thaliana polyubiquitin 10 (UBQ10) gene (Norris et al., Plant Molecular Biology 1993, 21(5), 895-906) followed by the tGFP coding sequence (Evrogen, Moscow, Russia). The tGFP coding sequence was codon-optimized for expression in dicot plants and was followed by the 3' untranslated region (UTR) comprising the transcriptional terminator and polyadenylation site of open reading frame 23 (ORF23) of A. tumefaciens pTi15955 (Barker et al., Plant Molecular Biology 1983, 2(6), 335-50). The hygromycin resistance gene expression cassette consisted of the 19S promoter including a 5' UTR from cauliflower mosaic virus (CaMV) (Cook and Penon, Plant Molecular Biology 1990, 14(3), 391-405) followed by the hygromycin phosphotransferase (hph) gene (Kaster et al., Nucleic Acids Research 1983, 11(19), 6895-6911). The hph gene was codon-optimized for expression in dicots and was flanked by a 3' UTR comprising the transcriptional terminator and polyadenylation site of Open Reading Frame 1 (ORF1) of A. tumefaciens pTi15955 (Barker et al., Plant Molecular Biology 1983, 2(6), 335-50). Both cassettes were synthesized by a commercial gene synthesis vendor (GeneArt, Life Technologies, Regensburg, Germany).

[0349] Vectors for gene splicing were constructed by cloning two tandem copies of the ZFN recognition sequence targeted by the ZFN encoded in the vector pDAB 10782. Vectors for gene editing were constructed by cloning one copy of each of the ZFN recognition sequences targeted by the ZFNs encoded in the vectors pDAB107827 and pDAB107828. In both cases the two ZFN recognition sequences were separated by the recognition sequences for the BamHI and NotI restriction endonucleases. The tGFP and HPH cassettes were cloned into the BamHI and NotI sites of each vector, resulting in four "donor" vectors: pDAS000340 (hygromycin-resistant gene-splicing donor: SEQ ID NO:352, Figure 24), pDAS000341 (tGFP reporter gene-splicing donor: SEQ ID NO:353, Figure 25), pDAS000342 (hygromycin-resistant gene-editing donor: SEQ ID NO:354, Figure 26) and pDAS000343 (tGFP reporter gene-editing donor: SEQ ID NO:355, Figure 27).

[0350] Colonies of the assembled plasmids were initially screened by restriction endonuclease digestion of DNA purified from overnight cultures of E. coli. Restriction endonucleases were obtained from New England BioLabs (NEB, Ipswich, MA) and Promega (Promega Corporation, WI). Plasmid preparations were performed using the QIAPREP SPIN MINIPREP KIT™ (Qiagen, Hilden, Germany) or the PURE YIELD PLASMID MAXIPREP SYSTEM™ (Promega Corporation, WI) following the instructions of the suppliers. After the restriction fragments were confirmed by agarose gel electrophoresis, plasmid DNA of selected clones was sequenced using ABI Sanger Sequencing and the BIG DYE TERMINATOR V3.1™ cycle sequencing protocol (Applied Biosystems, Life Technologies). Sequence data were assembled and analyzed using the SEQUENCHER™ software (Gene Codes, Ann Arbor, MI).

Maintenance of plant material for protoplast isolation

[0351] Mesophyll derived protoplasts were isolated from three-week old sterile shoot cultures of Brassica napus (DH10275). The corresponding seeds were germinated following the methods herein described. The seeds were surface-sterilized using 70% ethanol for 1 minute and gently shaken, followed by 3-4 rinses in sterile double-distilled water. The seeds were subsequently sterilized using 20% bleach and 10 μL of Tween 20. The seeds were further treated with the bleach on a table top shaker at approximately 100 RPM for 15 minutes, followed by 3-4 rinses in sterile double-distilled water. Seeds were carefully transferred to a sterile filter paper to remove the excess moisture and plated on seed germination medium (½ strength MS/B5 Vitamins + 1% sucrose + 0.8% Agar; pH 5.8).

[0352] Approximately 50-60 ml of media was poured into each PETRI™ dish (15 X 100 mm), which was placed at a slight angle using a support. Approximately 50 seeds were placed per plate. The plates were incubated upright at 22°C in 16h/d light (20 μmol m⁻² s⁻¹) for 6 days. Hypocotyl segments of 0.5 cm size were dissected from the six day old seedlings and cultured on shoot induction medium (MS/B5 Vitamins + 3% sucrose + 500 mg/L MES + BAP (13 μM) + Zeatin (5 μM) + Silver Nitrate (5 mg/L) + 0.8% Agar (pH 5.8)). The medium was poured in 100 x 20 mm sterile PETRI™ dishes; approximately 20 explants were placed per plate. Shoot meristems that appeared after 3-4 weeks were transferred to shoot elongation medium (MS/B5 Vitamins + 2% sucrose + 500 mg/L MES + BAP (2 μM) + GA-3 (0.1 μM) + 0.8% Agar (pH 5.8) and poured in 250 ml culture vessels) and the cultures were maintained in this medium for 4 weeks with one round of sub-culturing in between. Shoots of 2-3 cm height were then transferred to root initiation media (½ strength MS/B5 Vitamins + 1% sucrose + 500 mg/L MES + IBA (2.5 μM) + 0.6% Agar (pH 5.8) and poured in 700 ml culture vessels) for root development. Rooted shoots were sub-cultured in fresh root initiation media at 3-4 week intervals as stem cuttings for two to three rounds before use. The cultures were maintained throughout at 22°C in 16h/d light (30 μmol m⁻² s⁻¹).

Isolation and purification of mesophyll protoplasts

[0353] In vitro grown DH12075 Brassica napus plants were used as the explant source for isolating mesophyll protoplasts. To isolate the protoplasts, the 3rd to 4th upper fully expanded leaves from 3-4 week old plantlets were cut with a sharp scalpel into small strips (0.5 to 1 mm). Enzymatic digestion was carried out by treating 250-500 mg of leaf material with 25 ml of digestion buffer (1.2% (w/v) Cellulase "ONOZUKA™" R10 and 0.2% (w/v) MACEROZYME® R10 (Source - Duchefa) dissolved in K4 media (Spangenberg et al., 1998)). The PETRI™ dish containing the leaf material and digestion buffer was sealed with PARAFILM™ and incubated at room temperature for 12 to 15 h in darkness. After overnight incubation the digests were filtered through a BD® cell strainer (mesh size 70 μm). Protoplast suspensions (5-6 ml) collected in a 14 ml round bottomed tube were overlaid with 1 ml of W5 washing buffer (154 mM NaCl, 125 mM CaCl₂, 5 mM KCl and 5 mM glucose; pH 5.8; Menzel et al. (1981)).

[0354] The protoplast suspensions were further centrifuged at 400 RPM for 10 min. After centrifugation, protoplasts that floated in the interphase were withdrawn and washed by centrifugation using 10 ml of W5 buffer at 400 RPM for 10 min. After the final wash, isolated protoplasts were resuspended at a density of 1 × 10⁶ protoplasts per mL of W5 buffer and incubated for 1 hour before transfections.

Assessment of protoplast yield and viability

[0355] Protoplast yield was assessed using a haemocytometer following the method of Sambrook and Russell (2006). The cell viability was tested using 400 mg/L of Evans blue stain dissolved in 0.5 M of Mannitol as described by Huang et al. (1996) with a few minor modifications to the protocol.
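The yield and viability calculations can be sketched as follows. The sketch assumes a standard counting chamber in which each large (1 mm²) square holds 0.1 μL, giving a factor of 10⁴ per mL, and assumes that Evans blue stains only non-viable cells; function names and the example counts are hypothetical and are not data from this disclosure.

```python
def protoplasts_per_ml(counts_per_square, dilution_factor=1.0):
    """Estimate protoplasts per mL from counts in 1 mm^2 haemocytometer squares.

    Assumes a standard chamber depth of 0.1 mm, so each large square holds
    0.1 microlitre and the conversion factor to 1 mL is 1e4.
    """
    mean_count = sum(counts_per_square) / len(counts_per_square)
    return mean_count * dilution_factor * 1e4


def percent_viable(total_cells, stained_cells):
    """Percent viability, treating Evans blue stained cells as non-viable."""
    return 100.0 * (total_cells - stained_cells) / total_cells


def resuspension_volume_ml(total_protoplasts, target_density=1e6):
    """Volume of W5 buffer needed to reach the target density (protoplasts/mL)."""
    return total_protoplasts / target_density


if __name__ == "__main__":
    density = protoplasts_per_ml([100, 100, 100, 100], dilution_factor=2)
    print("protoplasts per mL:", density)
    print("percent viable:", percent_viable(200, 10))
    print("mL of W5 for 3e6 protoplasts:", resuspension_volume_ml(3e6))
```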

PEG 4000 mediated DNA delivery

[0356] Before delivery to B. napus protoplasts, plasmid DNA of each donor and ZFN construct was prepared from cultures of E. coli using the PURE YIELD PLASMID MAXIPREP SYSTEM® (Promega Corporation, Madison, WI) following the instructions of the suppliers. Aliquots of donor and ZFN plasmid DNA were prepared in three molar ratios: 1 :1 (30 μg of each plasmid), 5 : 1 (donor plasmid to ZFN plasmid to a total of 30 μg of plasmid DNA) and 10:1 (donor plasmid to ZFN plasmid to a total of 30 μg of plasmid DNA). Additionally, donor-only and ZFN-only aliquots (30 μg) were prepared as controls. The amounts of DNA delivered to the B. napus protoplasts via the PEG4000 mediated transformation are summarized in Table 17.
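For the 5:1 and 10:1 aliquots, where the total is fixed at 30 μg, the mass of each plasmid follows from the molar ratio and the plasmid lengths, since the mass of a double-stranded plasmid is proportional to its length. The sketch below illustrates this arithmetic; the plasmid sizes used in the example are hypothetical placeholders and are not the sizes of the donor or ZFN plasmids described herein.

```python
def split_mass_by_molar_ratio(total_ug, donor_bp, zfn_bp, donor_per_zfn):
    """Split a fixed total DNA mass between donor and ZFN plasmids.

    The mass of each plasmid is proportional to (number of molecules x length),
    so a donor:ZFN molar ratio of r:1 gives mass weights r*donor_bp and zfn_bp.
    Returns (donor_ug, zfn_ug).
    """
    donor_weight = donor_per_zfn * donor_bp
    zfn_weight = zfn_bp
    donor_ug = total_ug * donor_weight / (donor_weight + zfn_weight)
    return donor_ug, total_ug - donor_ug


if __name__ == "__main__":
    # Hypothetical plasmid sizes, for illustration only.
    for ratio in (5, 10):
        donor_ug, zfn_ug = split_mass_by_molar_ratio(30.0, 5000, 5000, ratio)
        print(f"{ratio}:1 -> donor {donor_ug:.2f} ug, ZFN {zfn_ug:.2f} ug")
```

The 1:1 aliquot described above uses 30 μg of each plasmid and is therefore not covered by this fixed-total calculation.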

Table 17: Quantities of ZFN and donor DNA delivered to protoplasts

[0357] Each aliquot of plasmid DNA was applied to one million protoplasts (viability >95%) suspended in 100 μL of transformation buffer (15 mM MgCl₂, 0.1% (w/v) morpholinoethanesulphonic acid (MES) and 0.5 M Mannitol; pH 5.8) followed by 150 μL of PEG solution (40% (w/v) PEG 4000 in 0.4 M Mannitol and 0.1 M Ca(NO₃)₂ (pH 6-7); Spangenberg and Potrykus (1995)). After 10-15 min of incubation at room temperature, 5 ml of W5 buffer was added in a drop wise manner and the protoplasts were gently mixed. Another 5 ml of W5 buffer was added as a slow stream to the protoplast suspension. Protoplasts were mixed gently and centrifuged at 400 RPM for 10 min, and the W5 supernatant was removed carefully, leaving behind the protoplasts in the form of a pellet. Transfected protoplasts were then incubated in 1 ml of W5 buffer at room temperature until they were embedded in bead type cultures. The transfected protoplasts were embedded following the sodium alginate method as described below.

Culturing of mesophyll derived protoplasts to recover viable microcalli

[0358] Before embedding, the transfected protoplasts were centrifuged at 400 RPM for 10 min and the W5 buffer was carefully removed. The protoplasts were then resuspended in 1.0 ml of 0.5 M Mannitol and incubated on ice. To this, an equal volume of 1.0% sodium alginate was added and mixed gently. The protoplast suspension was incubated on ice until it was embedded. Bead forming solution (0.4 M Mannitol + 50 mM CaCl₂ (pH 5.8)) was transferred to a sterile six well plate (3-4 ml per well) using a serological pipette. Exactly 1.0 ml of the protoplast suspension was added in a drop wise manner using a 1 ml pipette into the bead forming solution and each transfected sample (ca. 5 x 10^ protoplasts) was embedded per well. The protoplast suspension was incubated for 1-2 hours at room temperature to form sodium alginate beads. After the incubation period the bead forming solution was carefully removed and replaced with 4-5 ml of a 1:2 mixture of K3+H:A media (Spangenberg et al., 1998) supplemented with 1.5 mg/L of Hygromycin. The protoplasts were cultured for 3-4 weeks in darkness at 22°C in a shaker (50 RPM). After 3-4 weeks the resistant microcalli (0.5-1.0 mm) were released by treating with depolymerisation buffer (0.3 M Mannitol + 20 mM Sodium Citrate (pH 5.8)). After removing the liquid media, 3-4 ml of depolymerisation buffer was added to each well containing the bead-type cultures and incubated at room temperature for 2 hours. Using sterile forceps, the beads were gently mixed to enhance the efficient release of the microcalli. Next, a sterile 1.0 ml pipette was used to gently mix the gelling agent released into the depolymerisation buffer, which was subsequently removed. The microcalli were washed twice using 5 ml of liquid A media and resuspended in a sufficient quantity of liquid A (50 ml of liquid A was used for one ml of the settled cell volume (SCV: this was measured after transferring all the released microcalli to a sterile 50 or 15 ml falcon tube and allowing them to settle for 5 min)).
After mixing the microcalli uniformly, 0.5 ml of the microcalli suspended in the liquid A media was transferred to B1 media (MS/MS Vitamins + 3.5% Sucrose + 500 mg/L MES + BAP (5 μM) + NAA (5 μM) + 2,4-D (5 μM) + 1.5 mg/L Hygromycin + 0.7% Agarose Type I (pH 6.0) and poured in 100 x 20 mm sterile PETRI™ dish). Using 1-2 ml of additional liquid A media, the microcalli were distributed uniformly in the B1 media and the excess liquid A media was carefully removed from each plate. The plates were sealed using a micropore tape, which enhanced the embryo maturation. The cultures were maintained at 22°C in 16h/d light (30 μmol m⁻² s⁻¹).

Proliferation and regeneration of shoots from mesophyll derived protoplasts

[0359] Hygromycin resistant colonies were picked from B1 media (microcalli derived from both SA and SP methods) after 2-3 weeks of incubation and transferred to B2 media (MS/MS Vitamins + 3.0% Sucrose + 500 mg/L MES + 500 mg/L PVP + 5 mg/L Silver nitrate + 5 mg/L 2iP + NAA (0.5 μM) + GA-3 (0.3 μM) + 1.5 mg/L Hygromycin + 0.7% Agarose Type I (pH 5.8) and poured in 100 x 20 mm sterile PETRI™ dish). Approximately 25-30 calli were placed per plate and the plates were sealed using PARAFILM™ and incubated at 22°C in 16h/d light (30 μmol m⁻² s⁻¹). Hygromycin resistant colonies were subsequently recovered after 5-6 rounds of sub-culturing in B2 media at two week intervals. The number of calli per plate was reduced to 12-15 after a third round of sub-culturing. Shoot primordia that appeared after 10-12 weeks were carefully recovered along with the residual calli and transferred to shoot elongation medium (MS/B5 Vitamins + 2% sucrose + 500 mg/L MES + BAP (2 μM) + GA-3 (0.1 μM) + 300 mg/L Timentin + 1.5 mg/L Hygromycin + 0.8% Agar (pH 5.8) and poured in 250 ml culture vessels). The shoots that survived after 2-3 rounds of Hygromycin selection were transferred to rooting media (½ strength MS/B5 Vitamins + 1% sucrose + 500 mg/L MES + IBA (2.5 μM) + 1.5 mg/L Hygromycin + 0.6% Agar (pH 5.8) and poured in 700 ml culture vessels).

Isolation of genomic DNA from mesophyll protoplasts

[0360] Transfected protoplasts were transferred from the 3 cm PETRI™ dish to a 2 mL microfuge tube. The cells were pelleted by centrifugation at 70 g and the supernatant was removed. To maximize the recovery of transfected protoplasts, the PETRI™ dish was rinsed three times with 1 mL of wash buffer. Each rinse was performed by swirling the wash buffer in the PETRI™ dish for 1 minute, followed by transfer of the liquid to the same 2 mL microfuge tube. At the end of each rinse, the cells were pelleted by centrifugation at 70 g and the supernatant was removed. The pelleted protoplasts were snap frozen in liquid nitrogen before freeze drying for 24 h in a LABCONCO FREEZONE 4.5® (Labconco, Kansas City, MO) at -40°C and 133 × 10⁻³ mBar pressure. The lyophilized cells were subjected to DNA extraction using the DNEASY® PLANT DNA EXTRACTION MINI KIT (Qiagen) following the manufacturer's instructions, with the exception that tissue disruption was not required and the protoplast cells were added directly to the lysis buffer.

Isolation of genomic DNA from callus tissue

[0361] Individual calli were snap frozen in liquid nitrogen before freeze drying for 24 h in a LABCONCO FREEZONE 4.5® (Labconco, Kansas City, MO) at -40°C and 133 x 10⁻³ mBar pressure. The lyophilized calli were subjected to DNA extraction using the DNEASY® PLANT DNA EXTRACTION MAXI KIT (Qiagen, Hilden, Germany) following the manufacturer's instructions.

Isolation of genomic DNA from leaf tissue

[0362] Thirty (30) mg of young leaf tissue from regenerated plants was snap frozen in liquid nitrogen before freeze drying for 24 h in a LABCONCO FREEZONE 4.5® (Labconco, Kansas City, MO) at -40°C and 133 x 10⁻³ mBar pressure. The lyophilized tissue was subjected to DNA extraction using the DNEASY® PLANT DNA EXTRACTION MAXI KIT (Qiagen, Hilden, Germany) following the manufacturer's instructions.

PCR assays of genomic DNA for NHEJ-mediated splicing and editing of Fad3C

[0363] Integration of donor DNA into the Fad3C gene of B. napus was detected by a series of PCR assays in which at least one primer was specific to the Fad3C locus (Table 18) and a second primer was specific to either the promoter or terminator of the gfp cassette (Table 18 and Figure 28A). Specificity was obtained by designing oligonucleotides in which the last base pair aligned to a SNP that differentiated the Fad3C genomic sequence from the other copies of Fad3 genes and by including a phosphorothioate internucleotide linkage before this base pair, as indicated by an asterisk [*]. This design, used in combination with a polymerase having proofreading activity, directed specific amplification of each Fad3C or Fad3A allele and excluded other Fad3 copies as noted. Each primer set was empirically tested for amplification of the correct gene copies through Sanger-based sequencing of the PCR amplification products obtained from wild type B. napus.

Table 18: Oligonucleotide sequences used to detect integration of DNA into ZFN-induced double-stranded breaks.

* Indicates phosphorothioate internucleotide linkages to direct specific amplification (with proofreading polymerase) of Fad3C or Fad3A to exclusion of other copies of Fad3 as noted. Each primer set was empirically tested for amplification of the correct gene copies by Sanger-based sequencing of the PCR amplification products obtained from wild type B. napus.

Detection of gene addition to Fad3C by non-homologous end joining in protoplasts

[0364] Genomic DNA was extracted from protoplast pools (one million protoplasts per pool) to which donor DNA encoding a functional tGFP reporter cassette (pDAS000341 or pDAS000343), ZFN DNA (pDAB107827 or pDAB107828) or a mixture of donor and ZFN DNA had been delivered twenty-four hours earlier. Quantities of DNA delivered for transformation are described above. PCR products were cloned into plasmid vectors. Because genomic editing occurs independently in each cell and gives rise to a variety of different insertion events, cloning into a plasmid vector allows each genomic edit to be sequenced without ambiguity. Several clones were sequenced on an ABI3730XL® automated capillary electrophoresis platform. Analysis of gene sequences was done using SEQUENCHER SOFTWARE V5.0™ (GeneCodes, Ann Arbor, MI).

[0365] Evidence of gene addition to the Fad3C locus by editing or splicing was provided by amplification of both the 5' and 3' Fad3C-cassette junctions from genomic DNA extracted from protoplasts using the primers described in Table 18. PCR amplification with primers "FAD3CNHEJ-L4-F2" and "AtUbiNHEJ-R1" was completed to amplify the 5' junction of the tGFP cassette and Fad3C. PCR amplification with primers "FAD3CNHEJ-L4-R2" and "AtORF23tNHEJ-F1" was completed to amplify the 3' junction of the tGFP cassette and Fad3C. PCR amplification with primers "FAD3CNHEJ-L4-F2" and "FAD3CNHEJ-L4-R2" was completed to amplify across the double strand breaks induced by ZFN 28051-2A-28052. No amplification was observed from protoplasts to which ZFN plasmid or donor plasmid alone had been delivered. All junction sequences were indicative of insertion of the tGFP cassette at the Fad3C locus via an NHEJ-mediated repair pathway. Deletions of varying lengths from either or both the genome and the cassette were observed, as well as the insertion of sequences derived from the vector backbones (either from the donor or ZFN) between the genome and the cassette.

Detection of gene addition to Fad3C by non-homologous end joining in callus tissue regenerated from protoplasts

[0366] Further evidence of splicing and editing of the Fad3C locus was obtained from callus tissue regenerated from protoplasts on selection (1.5 mg/L hygromycin, as described above) to which donor DNA encoding an hph cassette (pDAS000340 or pDAS000342), ZFN DNA only (pDAB107827 or pDAB107828) or donor and ZFN DNA had been delivered (quantities of DNA delivered are given in Table 17). DNA was extracted four weeks after protoplast transfection from approximately 80 calli for each ratio, except editing 1:1:1, for which no calli survived.

[0367] Integration of the hph cassette into the B. napus genome (at Fad3C or randomly) was confirmed by TAQMAN™ qPCR using primers (SEQ ID NO:402; F - 5' CTTACATGCTTAGGATCGGACTTG 3', SEQ ID NO:403; R - 5' AGTTCCAGCACCAGATCTAACG 3') and probe (SEQ ID NO:404; 5' CCCTGAGCCCAAGCAGCATCATCG 3') specific to the hph gene. These primer-probe pairs were used in a duplex reaction with primers (SEQ ID NO:405; F - 5' CGGAGAGGGCGTGGAAGG 3', SEQ ID NO:406; R - 5' TTCGATTTGCTACAGCGTCAAC 3') and probe (SEQ ID NO:407; 5' AGGCACCATCGCAGGCTTCGCT 3') specific to the B. napus high mobility group protein I/Y (HMG I/Y), which is present as a single copy on the A genome (Weng et al., 2004, Plant Molecular Biology Reporter). Amplification was performed on a C1000 thermal cycler with the CFX96 or CFX384 REAL-TIME PCR DETECTION SYSTEM™ (BioRad, Hercules, CA). Results were analyzed using the CFX MANAGER™ (BioRad) software package. Relative quantification was calculated according to the 2^-ΔΔCt method (Livak and Schmittgen, 2001), which provided an estimation of the number of copies of hph cassette inserted into the genome. Evidence of NHEJ-mediated splicing and editing of Fad3C was obtained by conducting PCR assays with one primer specific to Fad3C and a second primer specific to either the promoter or terminator of the hph cassette (Table 17 and Figure 28B). Due to limited quantities of DNA obtained from callus tissue, only integration in the sense orientation was assayed. PCR products were gel-purified using QIAQUICK MINIELUTE PCR PURIFICATION KIT™ (Qiagen) and sequenced using a direct Sanger sequencing method. The sequencing products were purified with ethanol, sodium acetate and EDTA following the BIGDYE® v3.1 protocol (Applied Biosystems) and sequenced and analyzed as above.
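By way of illustration only, the relative quantification described above can be sketched as follows. The sketch assumes approximately 100% amplification efficiency for both the hph assay and the HMG I/Y reference assay and assumes a calibrator sample of known copy number. The function name and the Ct values in the usage line are hypothetical and are not taken from the experiments reported herein.

```python
def relative_copy_number(ct_target_sample, ct_ref_sample,
                         ct_target_calibrator, ct_ref_calibrator,
                         calibrator_copies=1.0):
    """Estimate target copy number with the 2^-ddCt calculation.

    Assumes both assays amplify with ~100% efficiency (hypothetical
    simplification) and that the calibrator carries a known number
    of copies of the target.
    """
    # dCt: target Ct normalized to the single-copy reference gene
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_calibrator = ct_target_calibrator - ct_ref_calibrator
    # ddCt: sample dCt relative to calibrator dCt
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return calibrator_copies * 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: the target crosses threshold one cycle earlier
# than in a single-copy calibrator, suggesting about two copies.
estimate = relative_copy_number(24.0, 25.0, 25.0, 25.0)
print(round(estimate, 2))
```

A result near 1 or 2 would be consistent with a low-copy insertion, although in practice the estimate depends on assay efficiency and on the choice of calibrator.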

[0368] The numbers of calli containing the donor cassette in each experiment are given in Table 18. Evidence of donor gene addition to the Fad3C locus by editing and/or splicing was provided by PCR amplification (with primers shown in Table 19) across the ZFN cut sites and both the 5' and 3' Fad3C-hph cassette junctions. PCR amplification of the genomic DNA isolated from callus tissue recovered from control protoplasts which were transformed with only the hph plasmid (pDAS000340 and pDAS000342) or only the ZFN plasmid (pDAB107827 and pDAB107828) did not result in the production of PCR amplification products.

[0369] The PCR amplicons produced from the amplification of the 5' and 3' Fad3C-hph cassette junctions were purified from the agarose gel and sequenced to confirm specificity of the integration within the Fad3C genomic locus. The results of the sequencing analysis of the PCR products indicated that each isolated callus which was generated from an individually transformed protoplast only produced a single PCR amplification product and did not contain cells of mixed genotypes.

[0370] In the experiments on NHEJ-mediated integration of donor sequences within the Fad3C genomic locus, the frequency of addition to the target locus (as defined by any part of the donor DNA vector being amplified from the target locus) was 42%, 46% and 32% for the DNA concentrations of 1:1, 5:1, and 10:1 (Donor DNA: ZFN DNA), respectively. See, Table 20. The frequency of on-target splicing was determined by assaying whether both cassette junctions were amplifiable and by sequencing the PCR products. These results verified that the cassette was inserted at the target locus in the correct orientation. The frequency of integration was calculated as 4%, 3% and 3% for the 1:1, 5:1 and 10:1 Donor plasmid DNA: ZFN plasmid DNA concentrations, respectively. In gene editing experiments the frequency of addition to the target locus, defined by any part of the donor DNA vector being amplified from the target locus, was 66% and 65% for the 5:1:1 and 10:1:1 Donor plasmid DNA: ZFN plasmid DNA concentrations, respectively. See, Table 21. The frequency of on-target editing was determined by assaying whether both cassette junctions were amplifiable and by sequencing the PCR products. These results verified that the cassette was inserted at the target locus in the correct orientation at frequencies of 3% and 6% for the 5:1:1 and 10:1:1 Donor plasmid DNA: ZFN plasmid DNA concentrations, respectively. As observed in the protoplast assays, base pairs were either deleted or additional bases were inserted between the genome and the cassette as a result of the cleavage of the genomic locus by the ZFN (Figures 30-31).

[0371] In certain instances the PCR products resulted in an addition of nucleotide sequences within the target locus, no PCR product, or a larger PCR product than observed in wild-type samples. These results which were produced from the PCR amplification using primers flanking the cut site indicated that the locus had been disrupted in both pairs of chromosomes (Figures 30-31). In some of the instances more than one band was amplified at the splice junctions (Figures 30-31) indicating that different insertions had occurred independently in each copy of the genome.

Table 19: Number of calli positive for presence of hph after four weeks on selection

Table 20: Number of calli with hph inserted by splicing at Fad3C locus at the DSB induced by ZFN28051-2A-28052

* no base pairs deleted or additional base pairs inserted at cut site

Table 21: Number of calli with hph inserted by editing at Fad3C locus at the cut sites induced by ZFN28051-2A-28052 and ZFN28053-2A-28054

* no base pairs deleted or additional base pairs inserted at cut site

Detection of gene addition to Fad3C by non-homologous end joining in plants

[0372] DNA was extracted from plants that were regenerated from protoplasts and transferred to potting medium (as described above). The majority of plants recovered were estimated to contain only 1-2 copies of the hph cassette encoded in the donor DNA. Plants were analyzed with the same suite of assays described for callus tissue as well as with assays to determine whether the cassette had inserted in an antisense orientation or whether the donor had integrated at the Fad3A locus.

Table 22: Estimated copy number of plants regenerated from protoplasts. For each ratio three transfections of one million protoplasts were performed.

[0373] The frequency of on-target splicing, where the hph cassette was inserted into Fad3C in either direction, was 51%, 32% and 56% for Donor DNA: ZFN DNA at concentrations of 1:1, 5:1 and 10:1, respectively (Table 23). Of these results, 35%, 32% and 50% (1:1, 5:1 and 10:1) were inserted in the forward orientation (Table 23).

[0374] The frequency of on-target editing, where the hph cassette was inserted into Fad3C in either direction, replacing the area from locus 4 to locus 6, was 2% and 0% for Donor DNA: ZFN DNA: ZFN DNA at concentrations of 5:1:1 and 10:1:1, respectively (Table 24). In addition, when both ZFNs were delivered at 5:1:1, 2% spliced into locus 4 and 10% spliced into locus 6, and when both ZFNs were delivered at 10:1:1, 10% spliced into locus 4 and 15% spliced into locus 6.

[0375] The bands obtained can be sequenced to determine the number of perfect borders. Additionally, plants can be screened for off-target insertions to determine the frequency of integration of hph at sites other than Fad3, and the frequency of integration at Fad3A rather than Fad3C.

Table 23: Number of plants with hph inserted by splicing at Fad3C locus at the DSB induced by ZFN28051-2A-28052

* no base pairs deleted or additional base pairs inserted at cut site

Table 24: Number of plants with hph inserted by editing at Fad3C locus at the cut sites induced by ZFN28051-2A-28052 and ZFN28053-2A-28054


* no base pairs deleted or additional base pairs inserted at cut site

Example 8: Targeted integration into and disruption of Corn Event DAS-59132

Characterization of an Endogenous Genomic Locus for Gene Targeting

[0376] The genomic locus of Corn Event DAS-59132 was described in International Patent Application No. WO 2009/100188 A2. Corn Event DAS-59132 comprises the Cry34Ab1, Cry35Ab1, and PAT transgene expression cassettes. These transgene expression cassettes were integrated into chromosome 8 of the B73 maize genome derived region of Hi-II maize germplasm (D. D. Songstad, W. L. Petersen, C. L. Armstrong, American Journal of Botany, Vol. 79, pp. 761-764, 1992) as a full length T-strand insert. In addition, the genomic DNA surrounding the transgenic locus lacked any large deletions relative to the native B73 sequence, and was generally devoid of repetitive elements except for a single, small repetitive element.

[0377] The genomic locus in which Corn Event DAS-59132 integrated was selected as an endogenous genomic locus for gene targeting. The selection of this endogenous genomic locus was based on the characterization of Corn Event DAS-59132. This event resulted from the integration of a T-strand into the endogenous genomic locus, and the subsequent expression of three transgene expression cassettes. In addition, there was minimal alteration of normal growth and development of corn plants which comprise Corn Event DAS-59132. The event retained the agronomic and breeding characteristics and was comparable in agronomic performance to non-transformed control plants.

[0378] An embodiment of the disclosure includes polynucleotide sequences that can be targeted for the integration of a transgene. The full length DNA molecule (PHI17662A) used to transform Corn Event DAS-59132, the 3' end of the genomic flanking sequence, and the PHI17662A / 3' maize genome junction were described in the disclosure of International Patent Application No. WO 2009/100188 A2, and are disclosed in this filing as SEQ ID NO:427, SEQ ID NO:428 and SEQ ID NO:429, respectively. The 5' end of the genomic flanking sequence, and the genomic locus where Corn Event DAS-59132 integrated into the corn genome are disclosed in this filing as SEQ ID NO:430 and SEQ ID NO:431, respectively. The genomic locus listed as SEQ ID NO:431 was used to design zinc finger proteins for gene targeting.

Production of Zinc Finger Proteins Designed to bind the Genomic Locus for Corn Event DAS-59132

[0379] Zinc finger proteins directed against DNA sequences which comprise the genomic locus for Corn Event DAS-59132 (see, Figure 32) were designed as previously described. See, e.g., Urnov et al. (2005) Nature 435:646-651. Exemplary target sequences and recognition helices are shown in Table 25A (recognition helix region designs) and Table 25B (target sites). In Table 25B, nucleotides in the target site that are contacted by the ZFP recognition helices are indicated in uppercase letters.

Table 25A: Genomic locus for Corn Event DAS-59132-binding zinc finger designs

Table 25B: Target Sequences for zinc finger proteins

[0380] The Corn Event DAS-59132 zinc finger designs were incorporated into vectors encoding a protein having at least one finger with a CCHC structure. See, U.S. Patent Publication No. 2008/0182332. In particular, the last finger in each protein had a CCHC backbone for the recognition helix. The non-canonical zinc finger-encoding sequences were fused to the nuclease domain of the type IIS restriction enzyme FokI (amino acids 384-579 of the sequence of Wah et al. (1998) Proc. Natl. Acad. Sci. USA 95:10564-10569) via a four amino acid ZC linker and an opaque-2 nuclear localization signal derived from Zea mays to form Corn Event DAS-59132 zinc-finger nucleases (ZFNs). Expression of the fusion proteins in a bicistronic expression construct utilizing a 2A ribosomal stuttering signal as described in Shukla et al. (2009) Nature 459:437-441 was driven by a relatively strong, constitutive and ectopic promoter such as the CsVMV promoter.

[0381] The optimal zinc fingers were verified for cleavage activity using a budding yeast based system previously shown to identify active nucleases. See, e.g., U.S. Patent Publication No. 20090111119; Doyon et al. (2008) Nat Biotechnol. 26:702-708; Geurts et al. (2009) Science 325:433. Zinc fingers for the various functional domains were selected for in vivo use. Of the numerous ZFNs that were designed, produced and tested to bind to the putative Corn Event DAS-59132 genomic polynucleotide target sites, four pairs of ZFNs were identified as having in vivo activity at high levels, and selected for further experimentation. See, Table 25A. These ZFNs were characterized as being capable of efficiently binding and cleaving the four unique Corn Event DAS-59132 genomic polynucleotide target sites in planta.

[0382] Figure 1 shows the genomic organization of the Corn Event DAS-59132 locus in relation to the ZFN polynucleotide binding/target sites of the four ZFN pairs. The first three ZFN pairs (E32 ZFN1, E32 ZFN2, and E32 ZFN3) bind upstream of the Corn Event DAS-59132 T-strand insert, the second three ZFN pairs (E32 ZFN4, E32 ZFN5, and E32 ZFN6) bind downstream of the Corn Event DAS-59132 T-strand insert. After testing the ZFN pairs in the budding yeast assay, ZFN pairs which optimally bound the Corn Event DAS-59132 locus were advanced for testing in a transient corn transformation assay.

Zinc finger nuclease constructs for expression in maize

[0383] Plasmid vectors containing ZFN expression constructs of the four exemplary zinc finger nucleases, which were identified using the yeast assay and described in Example 2, were designed and completed using skills and techniques commonly known in the art. Each zinc finger-encoding sequence was fused to a sequence encoding an opaque-2 nuclear localization signal (Maddaloni et al. (1989) Nuc. Acids Res. 17(18):7532), that was positioned upstream of the zinc finger nuclease.

[0384] Next, the opaque-2 nuclear localization signal::zinc finger nuclease fusion sequence was paired with the complementary opaque-2 nuclear localization signal::zinc finger nuclease fusion sequence. As such, each construct consisted of a single open reading frame comprised of two opaque-2 nuclear localization signal::zinc finger nuclease fusion sequences separated by the 2A sequence from Thosea asigna virus (Mattion et al. (1996) J Virol. 70:8124-8127). Expression of the ZFN coding sequence was driven by the highly expressing constitutive Zea mays Ubiquitin 1 Promoter (Christensen et al. (1992) Plant Mol. Biol. 18(4):675-89) and flanked by the Zea mays Per 5 3' polyA untranslated region (US PAT NO. 6,699,984). The resulting four plasmid constructs were confirmed via restriction enzyme digestion and via DNA sequencing. Figures 33 and 34 provide graphical representations of the completed plasmid constructs. The ZFN expressed in plasmid construct pDAB105906 (Figure 33) contains "Fok-Mono", which is a wildtype FokI endonuclease. The ZFN expressed in plasmid construct pDAB111809 (Figure 34) contains "FokI-ELD", which is a modified FokI endonuclease. The modified FokI endonuclease contains alterations as described in Doyon Y., Vo T., Mendel M., Greenberg S., Wang J., Xia D., Miller J., Urnov F., Gregory P., and Holmes M. (2010) Enhancing zinc-finger-nuclease activity with improved obligate heterodimeric architecture. Nature Methods, 8(1); 74-79.

[0385] A donor construct was designed to integrate into the ZFN cleaved genomic DNA of the Corn Event DAS-59132 genomic locus. Figure 35 illustrates the donor construct, pDAB100655, which consists of a single gene expression cassette. This single gene expression cassette is driven by the Zea mays Ubiquitin 1 promoter (Zm Ubi1 promoter) :: the aad-1 coding sequence (AAD1; US Patent No. 7,838,733) :: and is terminated by the Zea mays Per 5 3' untranslated region (ZmPer5 3'UTR). The construct contains a pair of repeated E32 ZFN6 binding sequences which were included downstream of the aad-1 gene expression cassette. The various gene elements were assembled in a high copy number pUC based plasmid.

Transient transformation of maize to determine ZFN efficiency

[0386] Maize Hi-II embryogenic cultures were produced as described in U.S. Pat. No. 7,179,902, and were used to evaluate and test the efficiencies of the different ZFNs. Plasmid DNA consisting of pDAB105901, pDAB105902, pDAB105903, pDAB105904, pDAB105905 and pDAB105906 was transiently transformed into maize callus cells to compare the cutting frequency of different ZFNs against a standard tested ZFN, pDAB7430, which was designed to the inositol polyphosphate 2-kinase gene locus within the maize genome as described in US Patent Application No. 2011/0119786.

[0387] From the cultures, 12 mL of packed cell volume (PCV) from a previously cryo-preserved cell line plus 28 mL of conditioned medium was subcultured into 80 mL of GN6 liquid medium (N6 medium (Chu et al., (1975) Sci Sin. 18:659-668), 2.0 mg/L 2,4-D, 30 g/L sucrose, pH 6.0) in a 500 mL Erlenmeyer flask, and placed on a shaker at 125 rpm at 28°C. This step was repeated two times using the same cell line, such that a total of 36 mL PCV was distributed across three flasks. After 24 hours, the contents were poured into a sterile PETRI™ dish and the GN6 liquid media was removed. Slightly moistened callus was transferred to a 2.5 cm diameter circle on GN6 S/M solid medium (N6 Medium (Chu et al., (1975) Sci Sin. 18:659-668), 2.0 mg/L 2,4-D, 30 g/L sucrose, 45.5 g/L sorbitol, 45.5 g/L mannitol, 100 mg/L myo-inositol, 2.5 g/L Gelrite, pH 6.0) containing filter paper. The plates were incubated in the dark for 4 hours at 28°C.

[0388] Microparticle gold (0.6 micron, BioRad, Hercules, CA) was prepared for DNA precipitation by weighing out 21 mg into a sterile, siliconized 1.7 mL microcentrifuge tube (Sigma-Aldrich, St. Louis, MO), and 350 μL of ice cold 100% ethanol was added and vortexed for 1 minute. The gold was pelleted by centrifugation at 10,000 rpm for 15 seconds using a MINISPIN™ centrifuge (Eppendorf, Hauppauge, NY). After removing the supernatant, 350 μL of ice cold, sterile water was added, mixed up and down with the pipette and centrifuged at 10,000 rpm for 15 seconds. The wash step was repeated one more time prior to suspending the gold in 350 μL of ice cold, sterile water. The washed gold was then stored at -20°C until needed.

[0389] For each DNA precipitation, 3 mg of gold in 50 μL of water was aliquoted into a siliconized 1.7 mL microcentrifuge tube (Sigma-Aldrich, St. Louis, MO). Plasmid DNA (2.5 μg E32 ZFN in plasmids pDAB105901, pDAB105902, pDAB105903, pDAB105904, pDAB105905 or pDAB105906 and 2.5 μg IPPK2 ZFN in plasmid pDAB7430) was premixed in 0.6 mL microcentrifuge tubes (Fisher Scientific, Nazareth, PA) and added to the gold suspension, gently pipetting up and down 5-10 times to mix thoroughly. Twenty microliters (20 μL) of cold 0.1 M spermidine was then added and gently mixed by pipetting up and down 5-10 times. Fifty microliters (50 μL) of ice cold 2.5 M calcium chloride was added slowly and gently mixed by pipetting up and down 5-10 times. The tube was then capped and allowed to incubate at room temperature for 10 minutes. After centrifuging for 15 seconds at 10,000 rpm, the supernatant was carefully removed and 60 μL of ice cold, 100% ethanol was added. The gold DNA mixture was resuspended by gently pipetting up and down 5-10 times.
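As an illustrative check only, the gold quantities recited above are mutually consistent: 21 mg of gold suspended in 350 μL of water yields 3 mg of gold per 50 μL aliquot, provided the suspension is kept well mixed. The following minimal sketch performs that arithmetic; the function names are hypothetical.

```python
def gold_mass_in_aliquot(stock_mass_mg, stock_volume_ul, aliquot_volume_ul):
    """Mass of gold (mg) in an aliquot, assuming a uniformly mixed suspension."""
    if stock_volume_ul <= 0:
        raise ValueError("stock volume must be positive")
    return stock_mass_mg * aliquot_volume_ul / stock_volume_ul


def aliquots_available(stock_volume_ul, aliquot_volume_ul):
    """Whole aliquots obtainable from the stock, ignoring pipetting losses."""
    return int(stock_volume_ul // aliquot_volume_ul)


# Values recited in the text: 21 mg gold in 350 uL, dispensed as 50 uL aliquots.
print(gold_mass_in_aliquot(21, 350, 50))   # mg of gold per precipitation
print(aliquots_available(350, 50))         # precipitations per gold preparation
```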

[0390] For microparticle bombardment, sterilized macrocarriers (BioRad, Hercules, CA) were fit into stainless steel holders (BioRad, Hercules, CA) and autoclaved. Nine microliters (9 μL) of gold/DNA suspension was evenly spread in the center of the macrocarrier, being sure to pipette up and down so as to keep the suspension well mixed between aliquots. Macrocarriers were then placed onto a piece of sterile 125 mm Whatman #4 filter paper (GE Healthcare, Buckinghamshire, UK) on a bed of 8-mesh DRIERITE™ (W.A. Hammond Drierite Co., Xenia, OH) in a 140 x 25 mm glass PETRI™ dish. The gold/DNA was allowed to dry completely for about 5-10 minutes. Rupture discs (1100 psi, BioRad, Hercules, CA) were sterilized by soaking for a few seconds in isopropyl alcohol, then loaded into the retaining cap of a microparticle bombardment device (PDS-1000, BioRad, Hercules, CA). An autoclaved stopping screen (BioRad, Hercules, CA) and a loaded macrocarrier were placed into the launch assembly, the lid was screwed on, and the assembly was slid into the bombardment chamber just under the nozzle. The PETRI™ dish containing the target was uncovered and placed in the bombardment chamber 6 cm below the nozzle. A vacuum was pulled (-0.9 bar) and the device was fired. Steps were repeated for each target blasted. Targets were incubated in the dark at a temperature of 28°C for 24 hours on the same blasting medium. Blasted cells were transferred to GN6 solid recovery medium (N6 medium (Chu et al., (1975) Sci Sin. 18:659-668), 2.0 mg/L 2,4-D, 30 g/L sucrose, 2.5 g/L Gelrite, pH 6.0) and incubated for an additional 48 hours at 28°C in the dark. Seventy-two hours post bombardment, the cells were harvested into 2 mL EPPENDORF MICROFUGE SAFE LOCK TUBES™ and lyophilized for 48 hours in a VIRTIS MODEL # 50L VIRTUAL XL-70 LYOPHILIZER™ (SP Scientific, Gardiner NY).

Next generation sequencing (NGS) analysis of transiently transformed maize

[0391] The transiently transformed maize callus tissue was analyzed to determine the cleavage efficiency of the zinc finger nuclease proteins.

Sample Preparation:

[0392] Maize callus tissue transiently transformed with the ZFN constructs and two control vectors, pDAB100664 and pDAB100665, was collected in 2 mL EPPENDORF™ tubes and lyophilized for 48 hrs. Genomic DNA (gDNA) was extracted from lyophilized tissue using the QIAGEN PLANT DNA EXTRACTION KIT™ (Valencia, CA) according to manufacturer's specifications. The isolated gDNA was resuspended in 200 μL of water and the concentration was determined using a NANODROP® spectrophotometer (Invitrogen, Carlsbad, CA). Integrity of the DNA was estimated by running all samples on 0.8% agarose E-gels (Invitrogen). All gDNA samples were normalized (25 ng/μL) for PCR amplification to generate amplicons which would be analyzed via ILLUMINA™ sequencing (San Diego, CA).

[0393] PCR primers for amplification of the genomic regions which span each tested ZFN cleavage site and the control samples were purchased from Integrated DNA Technologies (Coralville, IA). Optimum amplification conditions for the primers were identified by gradient PCR using 0.2 μM appropriate primers, ACCUPRIME PFX SUPERMIX™ (1.1X, Invitrogen) and 100 ng of template genomic DNA in a 23.5 μL reaction. Cycling parameters were initial denaturation at 95°C (5 min) followed by 35 cycles of denaturation (95°C, 15 sec), annealing (55-72°C, 30 sec), extension (68°C, 1 min) and a final extension (72°C, 7 min). Amplification products were analyzed on 3.5% TAE agarose gels. After identifying an optimum annealing temperature, preparative PCR reactions were carried out to validate each set of PCR primers and for generating the ILLUMINA™ sequencing amplicon.

[0394] For preparative PCR, 8-individual small scale PCR reactions were performed for each template using conditions described above and the resulting PCR products were pooled together and gel purified on 3.5% agarose gels using the QIAGEN MINELUTE GEL EXTRACTION/PURIFICATION KIT™ per manufacturer's recommendations. Concentrations of the gel purified amplicons were determined by NANODROP™ and the ILLUMINA™ sequencing samples were prepared by pooling approximately 100 ng of PCR amplicons from ZFN targeted and corresponding wild type controls. Primers used for the PCR amplicon generation are shown in Table 26 below.

Table 26: Oligonucleotides for amplification of ZFN binding sites

ILLUMINA™ Sequencing and Analysis:

[0395] The ZFNs were designed to recognize, bind and modify specific DNA sequences within the genomic locus of transgenic Corn Event DAS-59132. The efficiency by which the four ZFNs cleaved the genomic locus was assayed to determine which ZFN cleaved most efficiently. ILLUMINA™ sequencing was performed at Cofactor Genomics (St. Louis, MO) and sequences were analyzed using a sequence analysis script. Low quality sequences were filtered out and the remaining sequences were parsed according to unique DNA sequence identifiers. The unique DNA sequence identifiers were then aligned with the reference sequence and scored for insertions/deletions (indels). To determine the level of cleavage activity, the region surrounding the ZFN cleavage site was scored for the presence of sequence variants which resulted from the indels. Cleavage activity for each ZFN in the study was calculated as the number of sequences with indels per 1M high quality sequences or as a percentage of high quality sequences with indels. Next, the levels of cleavage efficiency were determined by normalizing the ZFN level of cleavage activity with the activity of a ZFN directed to the IPP2-K gene as described in U.S. Patent Publication No. 2011/0119786. Figure 36 and Table 27 present the cleavage efficiency of the tested ZFNs.
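The scoring and normalization described above can be sketched, in simplified form, as follows. This is not the sequence analysis script used in the study, which is not reproduced in this disclosure. The function names and read counts are hypothetical, and the sketch assumes that read filtering and indel calling have already been performed upstream.

```python
def indels_per_million(indel_reads, high_quality_reads):
    """Cleavage activity as indel-containing reads per 1M high quality reads."""
    if high_quality_reads <= 0:
        raise ValueError("high_quality_reads must be positive")
    return indel_reads * 1_000_000 / high_quality_reads


def percent_indels(indel_reads, high_quality_reads):
    """Cleavage activity as a percentage of high quality reads with indels."""
    return indels_per_million(indel_reads, high_quality_reads) / 10_000


def normalized_efficiency(test_activity, control_activity):
    """Cleavage efficiency of a test ZFN relative to a control ZFN.

    Both activities must be in the same units (for example, indels per 1M reads).
    """
    if control_activity <= 0:
        raise ValueError("control_activity must be positive")
    return test_activity / control_activity


# Hypothetical read counts, for illustration only.
test = indels_per_million(indel_reads=1_500, high_quality_reads=3_000_000)
control = indels_per_million(indel_reads=150, high_quality_reads=3_000_000)
print(test, control, normalized_efficiency(test, control))
```

Expressing both the test and control activities per million high quality reads before taking the ratio avoids bias from differing sequencing depths between samples.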

[0396] Event 32 ZFN6, which contains the 25716 and 25717 zinc finger binding domains, cleaved the genomic locus of transgenic Corn Event DAS-59132 with the highest efficiency. This ZFN functioned at 380 times the efficiency of the control IPPK2 zinc finger nuclease. Given the surprisingly high levels of cleavage activity of Event 32 ZFN6, this ZFN was selected for advancement to test the integration of a donor DNA fragment into a genomic locus via non-homologous end-joining.

Table 27: Cleavage efficiency of the tested eZFNs

Transformation of ZFN in protoplast

[0397] A system for gene targeting was established to target the endogenous genomic loci of Corn Event DAS-59132 and to optimize donor targeting parameters in maize. Double strand breaks were generated within the genome at Corn Event DAS-59132 and repaired by either the non-homologous end joining (NHEJ) or homology dependent repair (HDR).

Protoplast Isolation:

[0398] Maize Hi-II embryogenic suspension cultures were obtained and were maintained on a 3.5 day maintenance schedule. A 10 mL solution of sterile 6% (w/v) cellulase and a 10 mL solution of sterile 0.6% (w/v) pectolyase were pipetted into a 50 mL sterile conical tube using a 10 mL pipette tip. Next, 4 pack cell volumes (PCV) of Hi-II suspension cells were added into the 50 mL tube containing the digest solution and the tube was wrapped with parafilm. The tubes were placed on a platform rocker overnight at room temperature for ~16-18 hrs. The next morning, the tubes were removed from the shaker. In a sterile 50 mL conical tube the cells and enzyme solution were slowly filtered through a 100 μm cell strainer. Next, the cells were rinsed using a 100 μm cell strainer by pipetting 10 mL of W5 media through the strainer. In a sterile 50 mL conical tube, the cells and enzyme solution were slowly filtered through a 70 μm cell strainer. This straining step was followed by a second straining step, wherein the cells and enzyme solution were slowly strained into a 50 mL conical tube through a 40 μm cell strainer. Using a 10 mL pipette tip, the 40 μm cell strainer was rinsed with 10 mL of W5 media to give a final volume of 40 mL and the tube was inverted. Very slowly, 8 mL of sucrose cushion was added to the bottom of the protoplast/enzyme solution. Using a centrifuge with a swing arm bucket rotor, the tubes were spun for 15 minutes at 1500 rpm. The protoplast cells were removed using a 5 mL narrow bore pipette tip. These cells (7-8 mL), which were observed as a protoplast band, were removed very slowly and put into a sterile 50 mL conical tube. Next, 25 mL of W5 media was used to wash the tubes. The W5 media was added and the tubes were inverted slowly and centrifuged for 10 minutes at 1500 rpm. The supernatant was removed and 10 mL of MMG solution was added with slow inversion of the tube to resuspend the protoplast pellet. The density of protoplasts was determined using a haemocytometer; the 4 PCV yielded ~30 million protoplasts.

Protoplast Transformation:

[0399] The protoplast cells were diluted to 1.6 million protoplasts per mL using an MMG solution. The protoplasts were gently resuspended by slowly inverting the tube. Next, 300 μL of protoplasts (~500k protoplasts) were added to a sterile 2 mL tube, and the tubes were inverted to evenly distribute the protoplast cells. Plasmid DNA (about 40-80 μg) suspended in TE buffer was added to the protoplasts. The different experimental conditions are described in Table 28. The tubes were slowly rolled to suspend the DNA with the protoplasts, and the tubes were incubated for 5-10 minutes at room temperature. Next, 300 μL of PEG solution was added to the protoplast/DNA solution. Once all the PEG solution had been added, it was mixed with the protoplast solution by gently inverting the tube. The cocktail was incubated at room temperature for 15-20 minutes with periodic inverting of the tube(s). After the incubation, 1 mL of W5 solution was slowly added to the tubes and the tubes were gently inverted. The solution was then centrifuged at 1000 rpm for 15 minutes. The supernatant was carefully removed so as not to disturb the cell pellet. Finally, 1 mL of washing/incubating solution was added. The tubes were gently inverted to resuspend the cell pellet. The tubes were covered with aluminum foil to eliminate any exposure to light, and were laid on a rack on their side to incubate overnight. The cells were harvested 24 hours post-transformation for molecular analysis.
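The dilution arithmetic in this step can be sketched as follows. This is a minimal illustration, assuming the ~30 million protoplasts are held in the 10 mL of MMG solution used for resuspension; the function name and its parameters are hypothetical and not part of the disclosed protocol.

```python
def dilution_plan(total_protoplasts, stock_volume_ml, target_per_ml, aliquot_ul):
    """Return (final_volume_ml, mmg_to_add_ml, protoplasts_per_aliquot)."""
    final_volume_ml = total_protoplasts / target_per_ml
    if final_volume_ml < stock_volume_ml:
        raise ValueError("stock is already below the target density")
    mmg_to_add_ml = final_volume_ml - stock_volume_ml
    protoplasts_per_aliquot = target_per_ml * aliquot_ul / 1000.0
    return final_volume_ml, mmg_to_add_ml, protoplasts_per_aliquot


# Figures from the text: ~30 million protoplasts resuspended in 10 mL MMG
# (assumed stock volume), diluted to 1.6 million/mL, dispensed as 300 uL aliquots.
final_ml, add_ml, per_tube = dilution_plan(30e6, 10.0, 1.6e6, 300)
print(f"final volume: {final_ml:.2f} mL (add {add_ml:.2f} mL MMG)")
print(f"protoplasts per 300 uL aliquot: {per_tube:,.0f}")
```

Under these assumptions each 300 μL aliquot holds 480,000 protoplasts, which agrees with the "~500k protoplasts" stated above.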

Table 28: Different treatment groups were used for the transformation of the protoplast cells. The differing concentrations of the DNA used for the transformations are described below.

Sequence Validation of Targeting

[0400] The results of the ZFN cleavage activity in maize protoplasts were confirmed using the Next Generation Sequencing protocol described above. The sequenced PCR amplified fragments were scored for the presence of sequence variants resulting from indels. Event 32 ZFN6 cleaved the genomic locus of transgenic Corn Event DAS-59132 at about 1.5% of NHEJ/10 ng of targeted amplicon.

[0401] Targeting of an AAD-1 donor cassette into the genomic locus of transgenic Corn Event DAS-59132 in the Hi-II maize transgenic cell suspensions via Non Homologous End Joining (NHEJ) was confirmed via an in-out PCR reaction. In the in-out PCR reaction, a first PCR reaction was designed to amplify the junction of the AAD-1 donor and the genomic locus of transgenic Corn Event DAS-59132. The resulting amplicon was subjected to a second PCR reaction, wherein primers were designed to bind internally within the first amplicon. The combination of two independent PCR reactions removed background amplifications which may be false-positives. The in-out PCR results of the protoplast transformation experiments demonstrated that the genomic locus of transgenic Corn Event DAS-59132 could be reproducibly targeted with a 5.3 kb AAD-1 plasmid donor and the E32 ZFN6 zinc finger nuclease at a ratio of 1:10 μg of DNA (with and without filler DNA consisting of either pUC19 plasmid DNA or salmon sperm DNA). Targeting via an NHEJ method was evidenced by the insertion of the AAD-1 donor cassette in both orientations. The sequence data produced from the PCR reactions showed three instances of perfect integration of the donor DNA. Thus it was possible to demonstrate donor targeting into an endogenous maize locus using ZFNs via an NHEJ-DSB repair mechanism.

WHISKERS™ mediated stable transformation of ZFN and donor for targeted integration

[0402] Transgenic events were targeted to the endogenous genomic locus of Corn Event DAS-59132. Constructs as described in Example 2 include the donor sequence (pDAB100655) and Event 32 ZFN6 (pDAB105906).

[0403] Maize callus cells, consisting of 12 mL of packed cell volume (PCV) from a previously cryo-preserved cell line plus 28 mL of conditioned medium, were subcultured into 80 mL of GN6 liquid medium (N6 medium (Chu et al., (1975) Sci. Sin. 18:659-668), 2.0 mg/L of 2,4-D, 30 g/L sucrose, pH 5.8) in a 500 mL Erlenmeyer flask, and placed on a shaker at 125 rpm at 28°C. This step was repeated two times using the same cell line, such that a total of 36 mL PCV was distributed across three flasks. After 24 hours, the GN6 liquid media was removed and replaced with 72 mL GN6 S/M osmotic medium (N6 Medium, 2.0 mg/L 2,4-D, 30 g/L sucrose, 45.5 g/L sorbitol, 45.5 g/L mannitol, 100 mg/L myo-inositol, pH 6.0). The flask was incubated in the dark for 30-35 minutes at 28°C with moderate agitation (125 rpm). During the incubation period, a 50 mg/mL suspension of silicon carbide WHISKERS™ (Advanced Composite Materials, LLC, Greer, SC) was prepared by adding 8.1 mL of GN6 S/M liquid medium to 405 mg of sterile, silicon carbide WHISKERS™.

[0404] Following incubation in GN6 S/M osmotic medium, the contents of each flask were pooled into a 250 mL centrifuge bottle. After all cells in the flask settled to the bottom, the content volume in excess of approximately 14 mL of GN6 S/M liquid was drawn off and collected in a sterile 1-L flask for future use. The pre-wetted suspension of WHISKERS™ was mixed at maximum speed on a vortex for 60 seconds, and then added to the centrifuge bottle.

[0405] In this example, 159 μg of pDAB100655 (donor sequence) and 11 μg of pDAB105906 (ZFN) plasmid DNA were added to each bottle. Once the plasmid DNA was added, the bottle was immediately placed in a modified RED DEVIL 5400™ commercial paint mixer (Red Devil Equipment Co., Plymouth, MN), and agitated for 10 seconds. Following agitation, the cocktail of cells, media, WHISKERS™ and plasmid DNA was added to the contents of a 1-L flask along with 125 mL fresh GN6 liquid medium to reduce the osmoticant. The cells were allowed to recover on a shaker set at 125 rpm for 2 hours. Six mL of dispersed suspension was filtered onto Whatman #4 filter paper (5.5 cm) using a glass cell collector unit connected to a house vacuum line such that 60 filters were obtained per bottle. Filters were placed onto 60 x 20 mm plates of GN6 solid medium (same as GN6 liquid medium except with 2.5 g/L Gelrite gelling agent) and cultured at 28°C under dark conditions for 1 week.

Identification and isolation of putative targeted events integrated within the Corn Event DAS-59132 genomic locus

[0406] One week post-DNA delivery, filter papers were transferred to 60 x 20 mm plates of GN6 (1H) selection medium (N6 Medium, 2.0 mg/L 2,4-D, 30 g/L sucrose, 100 mg/L myo-inositol, 2.5 g/L Gelrite, pH 5.8) containing a selective agent. These selection plates were incubated at 28°C for one week in the dark. Following 1 week of selection in the dark, the tissue was embedded onto fresh media by scraping ½ the cells from each plate into a tube containing 3.0 mL of GN6 agarose medium held at 37-38°C (N6 medium, 2.0 mg/L 2,4-D, 30 g/L sucrose, 100 mg/L myo-inositol, 7 g/L SEAPLAQUE® agarose, pH 5.8, autoclaved for 10 minutes at 121°C).

[0407] The agarose/tissue mixture was broken up with a spatula and, subsequently, 3 mL of agarose/tissue mixture was evenly poured onto the surface of a 100 x 25 mm PETRI™ dish containing GN6 (1H) medium. This process was repeated for both halves of each plate. Once all the tissue was embedded, plates were incubated at 28°C under dark conditions for up to 10 weeks. Putatively transformed isolates that grew under these selection conditions were removed from the embedded plates and transferred to fresh selection medium in 60 x 20 mm plates. If sustained growth was evident after approximately 2 weeks, an event was deemed to be resistant to the applied herbicide (selective agent) and an aliquot of cells was subsequently harvested for genotype analysis. In this example, 24 events were recovered from 6 bottles treated. These events were advanced for molecular analysis to confirm the integration of the AAD-1 gene within a genomic locus of Corn Event DAS-59132.

Molecular analysis of NHEJ targeting of the Corn Event DAS-59132 genomic locus

[0408] The 24 events that were recovered from the WHISKERS™ mediated transformation, as described above, were analyzed using several different molecular confirmation tools. As a result of the analysis, events which contained a copy of the AAD-1 transgene integrated within the Corn Event DAS-59132 genomic locus were identified. Initially, the 24 events were confirmed to contain a copy of the AAD-1 transgene. Next, the events were analyzed to determine whether the genomic locus of Corn Event DAS-59132 was disrupted, which would suggest that a copy of the AAD-1 transgene had integrated via NHEJ within the genome of the maize cells. The events which were identified to contain a copy of the AAD-1 transgene within the genomic locus of Corn Event DAS-59132 were further confirmed via In-Out PCR and Southern blot reactions. These assays confirmed that the events contained a copy of the AAD-1 transgene integrated within the Corn Event DAS-59132 genomic locus via an NHEJ mechanism.

[0409] DNA extraction: DNA was extracted from lyophilized maize callus tissue using a QIAGEN BIOSPRINT 96™ DNA isolation kit per manufacturer's recommendations. A pre-defined program was used for the automated extraction and DNA was eluted in 200 μL of 1:1 TE Buffer/distilled water. Two microliters (2 μL) of each sample was quantified on a THERMO SCIENTIFIC NANODROP 8000™ and samples were normalized to 100 ng/μL using a QIAGEN BIOROBOT 3000™. Normalized DNA was stored at 4°C until further analysis.

[0410] Copy Number Estimation: Transgene copy number determination by hydrolysis probe assay, analogous to TAQMAN® assay, was performed by real-time PCR using the LIGHTCYCLER®480 system (Roche Applied Science, Indianapolis, IN). Assays were designed for AAD-1 and the internal reference gene Invertase using LIGHTCYCLER® Probe Design Software 2.0. For amplification, LIGHTCYCLER®480 Probes Master mix (Roche Applied Science, Indianapolis, IN) was prepared at 1X final concentration in a 10 μL volume multiplex reaction containing 0.4 μM of each primer and 0.2 μM of each probe (Table 29). A two-step amplification reaction was performed with an extension at 60°C for 40 seconds with fluorescence acquisition. Analysis of real time PCR copy number data was performed using LIGHTCYCLER® software release 1.5 using the relative quant module and is based on the ΔΔCt method. For this, a sample of gDNA from a single copy calibrator and a known two-copy check were included in each run.
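The ΔΔCt calculation named above can be illustrated with a short sketch. It is not the LIGHTCYCLER® software's implementation; it assumes the textbook form of the method with perfect doubling per cycle, and the function name and the Ct values in the usage lines are hypothetical.

```python
def copy_number_ddct(ct_target, ct_ref, cal_ct_target, cal_ct_ref,
                     calibrator_copies=1, efficiency=2.0):
    """Estimate transgene copy number relative to a known calibrator.

    delta Ct       = Ct(target) - Ct(reference)
    delta delta Ct = delta Ct(sample) - delta Ct(calibrator)
    copies         = calibrator_copies * efficiency ** (-delta delta Ct)
    """
    delta_sample = ct_target - ct_ref
    delta_calibrator = cal_ct_target - cal_ct_ref
    ddct = delta_sample - delta_calibrator
    return calibrator_copies * efficiency ** (-ddct)


# Hypothetical Ct values: the single-copy calibrator's target crosses threshold
# one cycle later than its reference; the sample's target crosses with its reference.
estimate = copy_number_ddct(ct_target=24.0, ct_ref=24.0,
                            cal_ct_target=25.0, cal_ct_ref=24.0)
print(f"estimated copies: {estimate:.1f}")
```

Including a known two-copy check in each run, as described above, gives a sample whose estimate should come out near 2 and so guards against a drifting calibrator.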

Table 29: Primer/Probe Sequences for hydrolysis probe assay of AAD1 and internal reference (Inv).

AAAGTTTGGAGGCTGCCGT 3'

SEQ ID NO:459 ; 5'

CGAGCAGACCGCCGTGTACTTCTACC

IV-Probe 3' HEX

Corn Event DAS-59132 Genomic Locus Disruption Assay:

[0411] A genomic locus disruption assay for Corn Event DAS-59132 was performed by real-time PCR using the LIGHTCYCLER®480 system (Roche Applied Science, Indianapolis, IN). Assays were designed to monitor the specificity with which Event 32 ZFN6 (25716/25717) bound and cleaved genomic sequences of the Corn Event DAS-59132 locus and the internal reference gene IVF using the LIGHTCYCLER® Probe Design Software 2.0. For amplification, LIGHTCYCLER®480 Probes Master mix (Roche Applied Science, Indianapolis, IN) was prepared at 1X final concentration in a 10 μL volume multiplex reaction containing 0.4 μM of each primer and 0.2 μM of each probe (Table 30). A two-step amplification reaction was performed with an extension at 55°C for 30 seconds with fluorescence acquisition. Analysis for the disruption assay was performed using target to reference ratio (Figure 37). Four of the eight events were identified as containing an AAD-1 transgene integrated into the genomic locus of Corn Event DAS-59132. The following events, Event 100655/105906[1]-001, Event 100655/105906[5]-013, Event 100655/105906[5]-015, and Event 100655/105906[3]-018, were advanced for further molecular analysis to confirm the integration of the AAD-1 transgene within the genomic locus of Corn Event DAS-59132.
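A target-to-reference ratio analysis of this kind could be sketched as below. The ratio formula assumes perfect doubling per cycle, and the 0.6 cut-off, the event names, and the Ct values are all hypothetical; the source does not state the threshold that was used.

```python
def target_to_reference_ratio(ct_target, ct_ref):
    """Relative abundance of intact target versus reference (2^(Ct_ref - Ct_target))."""
    return 2.0 ** (ct_ref - ct_target)


def flag_disrupted(events, control, threshold=0.6):
    """Return events whose ratio, normalized to the control, falls below threshold.

    events:  dict of name -> (ct_target, ct_ref)
    control: (ct_target, ct_ref) for a non-targeted sample
    """
    control_ratio = target_to_reference_ratio(*control)
    flagged = {}
    for name, (ct_target, ct_ref) in events.items():
        normalized = target_to_reference_ratio(ct_target, ct_ref) / control_ratio
        if normalized < threshold:
            flagged[name] = normalized
    return flagged


# Hypothetical Ct values for two events and one non-targeted control.
events = {"event_A": (26.0, 25.0), "event_B": (25.1, 25.0)}
print(flag_disrupted(events, control=(25.0, 25.0)))
```

A donor inserted at the cleavage site separates or destroys the sequence the assay amplifies, so a targeted allele contributes no target signal and the normalized ratio drops.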

Corn Event DAS-59132 Locus Specific In-Out PCR:

[0412] The insertion of the AAD-1 donor within the genomic locus of Corn Event DAS-59132 via NHEJ can occur in one of two orientations. The integration of the AAD-1 transgene and the orientation of this integration were confirmed with an In-Out PCR assay. The In-Out PCR assay utilizes an "Out" primer that was designed to bind to the genomic locus of Corn Event DAS-59132 target sequence. In addition, an "In" primer was designed to bind to the AAD-1 donor sequence. The amplification reactions which were completed using these primers only amplify a donor gene which is inserted at the target site. The resulting PCR amplicon was produced from the two primers, and consisted of a sequence that spanned the junction of the insertion. For each sample, two sets of In-Out PCR primers were multiplexed into one reaction and were used to detect the NHEJ-mediated donor insertion which could occur in one of two different orientations. Positive and negative controls were included in the assay. Two positive control plasmids, pDAB100664 and pDAB100665, were constructed to simulate donor insertion at the genomic locus of Corn Event DAS-59132 in one of the two different orientations.
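The logic of the In-Out design can be illustrated with a small in silico PCR. All sequences and primer names below are hypothetical toy values chosen for the illustration, not the primers of Table 30, and exact string matching stands in for primer annealing.

```python
def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def amplicon_length(template, forward, reverse):
    """Length of the product primed by `forward` and `reverse`, or None."""
    start = template.find(forward)
    if start == -1:
        return None
    site = revcomp(reverse)  # the reverse primer anneals to the bottom strand
    end = template.find(site, start + len(forward))
    if end == -1:
        return None
    return end + len(site) - start


# Hypothetical toy sequences (not from the source).
LEFT = "ATGCCGTAAGCTTGACCTGA"              # genomic flank upstream of the cut
RIGHT = "TTGACGGATCCATGGCTAAC"             # genomic flank downstream of the cut
DONOR = "GGCATTCGAGCTCAAGTCCGTACGATTGCA"   # donor cassette

OUT = LEFT[:10]               # "Out" primer in the genomic flank
IN_1 = revcomp(DONOR[5:15])   # "In" primer that detects orientation 1
IN_2 = DONOR[20:30]           # "In" primer that detects orientation 2

templates = {
    "untargeted": LEFT + RIGHT,
    "orientation 1": LEFT + DONOR + RIGHT,
    "orientation 2": LEFT + revcomp(DONOR) + RIGHT,
}
for name, template in templates.items():
    products = {label: amplicon_length(template, OUT, primer)
                for label, primer in (("In-1", IN_1), ("In-2", IN_2))}
    print(name, products)
```

Each In primer yields a product only when the donor sits at the target in the matching orientation, which is why multiplexing both primer sets in one reaction reports presence and orientation together.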

[0413] A DNA intercalating dye, SYTO-13, was used in the PCR mix in order to detect amplification in real time on a thermocycler with fluorescence detection capability. In addition, a melting temperature (Tm) analysis program was attached to a regular PCR program so the amplified products could be analyzed for their Tm profiles. The similarity in the Tm profile between an unknown sample and the positive control sample strongly suggests that the unknown sample has the same amplified product as that of the positive control. The PCR reactions were conducted using 10 ng of template genomic DNA, 0.2 μM dNTPs, 0.2 μM forward and reverse primers, 4 μM SYTO-13 and 0.15 μL of Ex Taq HS. Reactions were completed in two steps: the first step consisted of one cycle at 94°C (2 minutes) and 35 cycles at 98°C (12 seconds), 66°C (30 seconds) and 68°C (1.3 minutes); the second step was a Tm program covering 60-95°C followed by 65°C (30 seconds) and 72°C (10 minutes) (Table 30). The amplicons were sent out for sequencing to confirm that the AAD-1 gene had integrated within the genomic locus of Corn Event DAS-59132.
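The cycling program above can be written down as data to check its programmed hold time. This is only a bookkeeping sketch: the variable and function names are hypothetical, and the duration of the 60-95°C melt ramp and all temperature transitions are excluded because the text does not give them.

```python
# (temperature in C, hold time in seconds), taken from the program in the text.
INITIAL_DENATURATION = [(94, 120)]       # one cycle, 2 minutes
CYCLE = [(98, 12), (66, 30), (68, 78)]   # 1.3 minutes = 78 seconds
N_CYCLES = 35
FINAL_STEPS = [(65, 30), (72, 600)]      # steps after the Tm program


def total_hold_seconds(initial, cycle, n_cycles, final):
    """Sum of programmed hold times, excluding ramping and the melt curve."""
    return (sum(seconds for _, seconds in initial)
            + n_cycles * sum(seconds for _, seconds in cycle)
            + sum(seconds for _, seconds in final))


seconds = total_hold_seconds(INITIAL_DENATURATION, CYCLE, N_CYCLES, FINAL_STEPS)
print(f"programmed hold time: {seconds} s ({seconds / 60:.1f} min)")
```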

[0414] The results of the real-time, In-Out PCR amplicons were visualized using the ABI software. These results were further confirmed using a gel shift assay, wherein the amplicons were run on a 1.2% TAE gel. Expected amplicon sizes were ~1.8 kb for the first orientation (as in pDAB100664) and ~2 kb for the second orientation (as in pDAB100665). The gel shift assay results confirmed the real-time, In-Out PCR reaction data. Both sets of data suggested that a copy of the AAD-1 transgene had integrated via NHEJ within the genome of the maize cells at the genomic locus of Corn Event DAS-59132.

Table 30: Primers for In-Out PCR to detect NHEJ mediated targeting at Corn Event DAS-59132 in maize cells.

[0415] Southern Blot Analysis: The maize callus events identified above were further screened using a Southern blot assay. This assay was used to further confirm that the AAD-1 transgene had integrated via NHEJ within the genome of the maize cells at the genomic locus of Corn Event DAS-59132. The Southern blot analysis experiments generated data which demonstrated the integration and integrity of the AAD-1 transgene within the maize genome.

[0416] DNA Extraction: Genomic DNA was extracted from the callus tissue harvested from each individual event. Initially, the tissue samples were collected in 2 mL tubes and lyophilized for 2 days. Tissue maceration was performed with a KLECO TISSUE PULVERIZER™ and tungsten beads (Kleco, Visalia, CA). Following tissue maceration, the genomic DNA was isolated using the DNEASY PLANT MINI KIT™ (Qiagen, Germantown, MD) according to the manufacturer's suggested protocol.

[0417] Southern Blot: Genomic DNA (gDNA) was quantified using the QUANT-IT PICO GREEN DNA ASSAY KIT™ (Molecular Probes, Invitrogen, Carlsbad, CA). Quantified gDNA was adjusted to 4 μg for the Southern blot analysis. DNA samples were then digested using the NcoI restriction enzyme (New England BioLabs, Ipswich, MA) overnight at 37°C, followed by a clean-up using QUICK-PRECIP™ (Edge BioSystem, Gaithersburg, MD) according to the manufacturer's suggested protocol. DNA was resuspended in 1X dye and electrophoresed for 5 hours on a 0.8% SEAKEM LE AGAROSE GEL™ (Lonza, Rockland, ME) at 110 volts in a cold room. The gel was denatured, neutralized, and then transferred to a nylon charged membrane (Millipore, Bedford, MA) overnight. DNA was crosslinked to the membrane using the UV STRATA LINKER 1800™ (Stratagene, La Jolla, CA), and blots were prehybridized with 20 mL of PERFECTHYB PLUS™ (Sigma, St. Louis, MO). The 226 bp probe (SEQ ID NO:464) was labeled using PRIME-IT RMT RANDOM™ (Stratagene, La Jolla, CA) according to the manufacturer's suggested protocol and purified using the PROBE QUANT G-50 MICRO COLUMNS™ (GE Healthcare, Buckinghamshire, UK) per the manufacturer's suggested protocol. Approximately 20,000,000 cpm of the labeled probe was added to the blots and incubated overnight. Blots were washed twice for 15 minutes per wash, placed on a phosphor image screen for 24 hours, and analyzed by a STORM 860 SCANNER™ (Molecular Dynamics).

[0418] Expected and observed fragment sizes for the NcoI digest and probe, based on the known restriction enzyme sites of AAD-1 and Corn Event DAS-59132, were obtained from the Southern blots. Two DNA fragments were identified from these digests and hybridizations. Southern blots which produced bands at sizes of around 2.9 and 5.5 kb indicated that the AAD-1 transgene had integrated into the genomic locus of Corn Event DAS-59132 via an NHEJ mechanism.

Example 9: Targeted integration and disruption of corn Engineered Landing Pad

Characterization of ELP genomic target sequence

[0419] The genomic locus in which an Engineered Landing Pad (ELP) was integrated was selected as an endogenous genomic locus for gene targeting. The construction of the ELP sequences, which comprise Zinc Finger binding sites (eZFN1 and eZFN3) and about 1.0 kb of random artificial sequences which flank the Zinc Finger binding sites, and the Zinc Finger Nuclease proteins are described in International Patent Application WO2011091317, herein incorporated by reference in its entirety. To test NHEJ-mediated targeted integration within the ELP loci, two donor DNAs were constructed, both of which contain one of the two eZFN binding sites in a 5.3 kb plasmid comprising an aad-1 gene which confers resistance to the herbicide haloxyfop. Figure 38 presents a representative schematic of the integration.

Regeneration of transgenic plant events comprising an ELP

[0420] Four transgenic ELP events produced from the transformation of pDAB100640, pDAB100641 and pDAB106685 (two for each ELP) were generated as described in International Patent Application WO2011091317, herein incorporated by reference in its entirety. The events were obtained and confirmed to be single copy and to contain an intact PTU comprising the ELP. These events were regenerated to produce donor material for targeting. Healthy growing tissue was transferred first to 28(1H) (MS medium (Murashige and Skoog (1962) Physiol Plant 15:473-497), 0.025 mg/L 2,4-D, 5 mg/L BAP, 1.0 mg/L Herbiace, 30 g/L sucrose, 2.5 g/L gelrite, pH 5.7) and incubated in low light (14 μE/m² sec, 16 hr photoperiod) for 7 days followed by high light (89 μE/m² sec, 16 hr photoperiod) for another 7 days. Greening structures were transferred to 36(1H), which is the same as 28(1H) minus the BAP and 2,4-D, and incubated in high light (40 μE/m² sec, 16 hr photoperiod) until shoot structures developed sufficient roots for transplanting to the greenhouse. Plants were grown to maturity in the greenhouse using a mix of 95% METRO-MIX 360® and 5% clay/loam soil and pollinated dependent on the health of the plant. Vigorously growing plants were selfed or sibbed (plants from the same event), and less vigorous plants were crossed with Hi-II or A188 to maintain embryogenic capacity of donor material. T₁ seed was planted in 4" pots and germinating seedlings were screened for zygosity via qPCR. Seedlings determined to be homozygous for the PAT gene were transferred to 5 gallon pots, grown to reproductive stage, and outcrossed to Hi-II, and T₂ embryos were used for targeting via NHEJ mediated integration.

NHEJ targeting of ELP protoplasts

[0421] Zea mays Hi-II protoplasts were obtained and transformed using the previously described protoplast transformation protocol. Donor plasmid DNA of pDAB100651 was transformed with the ZFN plasmid DNA of pDAB105941. Likewise, donor plasmid DNA of pDAB100652 was transformed with the ZFN plasmid DNA of pDAB105943. The donor DNAs were transformed with Zinc Finger Nucleases into the ELP transgenic plants. Upon the introduction of the donor DNA and the eZFNs, both the donor DNA and the ELP loci, which are integrated within the genomic target DNA, were cleaved. The donor DNA was subsequently inserted into the genomic target. The insertion of the donor DNA within the ELP genomic loci can occur in either orientation. Insertion of the donor DNA in the direct orientation will result in a junction sequence that corresponds to the expected annealing and ligation of the 4 bp single-stranded complementary ends generated from ZFN cleavage. Insertions and deletions (indels) at the junctions are common. Insertion of the donor DNA in the reverse orientation will result in both junctions containing indels.
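Why the direct orientation can ligate cleanly while the reverse orientation cannot is shown by the following sketch. It assumes the donor and the target carry the same eZFN site and that cleavage leaves 4-nt 5' overhangs; the overhang sequences in the usage lines are hypothetical, as the source does not list them.

```python
def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def ends_anneal(overhang_a, overhang_b):
    """Two 5' overhangs (each written 5'->3') pair perfectly when one is the
    reverse complement of the other."""
    return overhang_a == revcomp(overhang_b)


def junction_compatibility(overhang):
    """`overhang` is the 4-nt top-strand sequence between the staggered cuts."""
    # Cutting one site leaves revcomp(overhang) on the upstream fragment
    # and overhang on the downstream fragment.
    target_left, target_right = revcomp(overhang), overhang
    donor_start, donor_end = overhang, revcomp(overhang)
    direct = (ends_anneal(target_left, donor_start)
              and ends_anneal(donor_end, target_right))
    # Flipping the donor swaps which donor end meets which target end.
    reverse = (ends_anneal(target_left, donor_end)
               and ends_anneal(donor_start, target_right))
    return {"direct": direct, "reverse": reverse}


print(junction_compatibility("GTCA"))  # hypothetical non-palindromic overhang
print(junction_compatibility("AATT"))  # hypothetical palindromic overhang
```

In this model the reverse orientation pairs perfectly only for a palindromic overhang. Otherwise the mismatched ends must be processed before ligation, which is consistent with indels at both junctions.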

[0422] Protoplast Isolation: Maize Hi-II embryogenic suspension cultures were obtained and maintained on a 3.5 day maintenance schedule. A 10 mL solution of sterile 6% (w/v) cellulase and a 10 mL solution of sterile 0.6% (w/v) pectolyase were pipetted into a sterile 50 mL conical tube using a 10 mL pipette tip. Next, 4 packed cell volumes (PCV) of Hi-II suspension cells were added to the 50 mL tube containing the digest solution, and the tube was wrapped with parafilm. The tubes were placed on a platform rocker overnight at room temperature for ~16-18 hrs. The next morning, the tubes were removed from the shaker. The cells and enzyme solution were slowly filtered through a 100 μm cell strainer into a sterile 50 mL conical tube. Next, the cells were rinsed by pipetting 10 mL of W5 media through the 100 μm cell strainer. The cells and enzyme solution were then slowly filtered through a 70 μm cell strainer into a sterile 50 mL conical tube. This straining step was followed by a second straining step, wherein the cells and enzyme solution were slowly strained into a 50 mL conical tube through a 40 μm cell strainer. Using a 10 mL pipette tip, the 40 μm cell strainer was rinsed with 10 mL of W5 media to give a final volume of 40 mL, and the tube was inverted. Very slowly, 8 mL of sucrose cushion was added to the bottom of the protoplast/enzyme solution. Using a centrifuge with a swing arm bucket rotor, the tubes were spun for 15 minutes at 1500 rpm. The protoplast cells (7-8 mL), which were observed as a protoplast band, were removed very slowly using a 5 mL narrow bore pipette tip and put into a sterile 50 mL conical tube. Next, 25 mL of W5 media was used to wash the tubes. The W5 media was added, and the tubes were inverted slowly and centrifuged for 10 minutes at 1500 rpm. The supernatant was removed and 10 mL of MMG solution was added with slow inversion of the tube to resuspend the protoplast pellet.
The density of protoplasts was determined using a haemocytometer; the 4 PCV yielded ~30 million protoplasts.

[0423] Protoplast Transformation: The protoplast cells were diluted to 1.6 million protoplasts per mL using an MMG solution. The protoplasts were gently resuspended by slowly inverting the tube. Next, 300 μL of protoplasts (~500k protoplasts) were added to a sterile 2 mL tube, and the tubes were inverted to evenly distribute the protoplast cells. Plasmid DNA (about 80 μg) suspended in TE buffer was added to the protoplasts. Both the ZFN and donor plasmid constructs were transformed. The eZFN expressing plasmids (pDAB105941 and pDAB105943) were added alone or in combination at a 1:1 or 10:1 ratio of Donor DNA (pDAB100651 and pDAB100652) to eZFN DNA for each of the eZFN1 and eZFN3 treatments. The efficacy of the eZFNs had previously been tested and is described in more detail in International Patent Application WO2011091317. Figure 39 is provided as a representation of the eZFN relative to the donor DNA. The tubes were slowly rolled to suspend the DNA with the protoplasts, and the tubes were incubated for 5-10 minutes at room temperature. Next, 300 μL of PEG solution was added to the protoplast/DNA solution. Once all the PEG solution had been added, it was mixed with the protoplast solution by gently inverting the tube. The cocktail was incubated at room temperature for 15-20 minutes with periodic inverting of the tube(s). After the incubation, 1 mL of W5 solution was slowly added to the tubes and the tubes were gently inverted. The solution was then centrifuged at 1000 rpm for 15 minutes. The supernatant was carefully removed so as not to disturb the cell pellet. Finally, 1 mL of washing/incubating solution was added. The tubes were gently inverted to resuspend the cell pellet. The tubes were covered with aluminum foil to eliminate any exposure to light, and were laid on a rack on their side to incubate overnight. The cells were harvested 24 hours post-transformation for molecular analysis to identify ELP loci which contained a donor integrated via NHEJ-mediated DNA repair.

Table 31: DNA concentrations of the eZFN and Donor plasmid DNA that were transformed into the maize cells and integrated within the ELP loci.

eZFN1 Donor + eZFN1 (10:1) 100651 40 105941 4 36 80

— — — — — — —

eZFN3 Donor alone 100652 40 40 80

eZFN3 alone — — 105943 40 40 80

eZFN3 Donor + eZFN3 (1:1) 100652 40 105943 40 0 80

eZFN3 Donor + eZFN3 (10:1) 100652 40 105943 4 36 80
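The amounts in the surviving rows of Table 31 are consistent with a fixed 80 μg total per treatment, made up with a balancing quantity of DNA. The sketch below reproduces that arithmetic; reading the next-to-last column as filler DNA is an assumption (the column headers did not survive), and the function name is hypothetical.

```python
def dna_mix(donor_ug, donor_to_zfn_ratio=None, total_ug=80.0):
    """Return (zfn_ug, filler_ug) for a treatment holding total DNA constant.

    donor_to_zfn_ratio: donor:eZFN mass ratio (1 for 1:1, 10 for 10:1),
                        or None when no eZFN plasmid is added.
    """
    zfn_ug = 0.0 if donor_to_zfn_ratio is None else donor_ug / donor_to_zfn_ratio
    filler_ug = total_ug - donor_ug - zfn_ug
    if filler_ug < 0:
        raise ValueError("donor plus eZFN DNA exceeds the total")
    return zfn_ug, filler_ug


for label, ratio in (("donor alone", None), ("1:1", 1), ("10:1", 10)):
    zfn, filler = dna_mix(40.0, ratio)
    print(f"{label}: eZFN {zfn:g} ug, filler {filler:g} ug, total 80 ug")
```

Holding the total DNA constant keeps the DNA load per protoplast the same across treatments, so differences in targeting can be attributed to the donor:eZFN ratio rather than to the amount of DNA delivered.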

Molecular confirmation of maize ELP targeting by NHEJ in protoplasts

[0424] DNA extraction: DNA was extracted from maize tissue using a Qiagen BIOSPRINT 96™ robot via automation and DNA was eluted in 200 μL of 1:1 TE Buffer/distilled water. DNA of each sample was quantified on a THERMO SCIENTIFIC NANODROP 8000™ and samples were normalized to 100 ng/μL using a QIAGEN BIOROBOT 3000™. Normalized DNA was stored at 4°C until further analysis.

[0425] ELP locus disruption assay: After harvesting the protoplasts at 24 hr, DNA was extracted and analyzed using a disruption assay, junction analysis using PCR, and sequencing of the DNAs produced by the junction PCR. The disruption assay is an indirect measure of relative eZFN cleavage activity. The TAQMAN™-based assay determines the loss of intact eZFN binding sites in the target DNA, as would be expected due to misrepair of the DNA ends that can occur upon ligation of the ends.

[0426] The data from the TAQMAN™-based assay suggests that eZFNl has a higher activity than does eZFN3 and also that there is significant cleavage of the target DNA by the eZFNs as demonstrated by the statistically significant reduction in the signal in the eZFN samples compared to the donor alone samples.

[0427] Cleavage of genomic target DNA by eZFNs. DNA was isolated from each treatment group (6 replicates each) as indicated. TAQMAN™ assays were used to measure cleavage of the target DNA. The ELP locus disruption assay was performed by real-time PCR using the LIGHTCYCLER®480 system (Roche Applied Science, Indianapolis, IN). Assays were designed to monitor eZFN1 and eZFN3 binding sequences within ELP1 and the internal reference gene IVF using LIGHTCYCLER® Probe Design Software 2.0. For amplification, LIGHTCYCLER®480 Probes Master mix (Roche Applied Science, Indianapolis, IN) was prepared at 1X final concentration in a 10 μL volume multiplex reaction containing 0.4 μM of each primer and 0.2 μM of each probe (Table 33). A two-step amplification reaction was performed with an extension at 55°C for 30 seconds with fluorescence acquisition. Analysis for the disruption assay was performed using target to reference ratio. The locations of the primers are shown in Figure 40.

Table 33: Primer and probe sequences for the disruption assay

[0428] Locus Specific In-Out PCR: Donor insertion at the expected ELP site using NHEJ-mediated repair can result in two different orientations. Two positive control plasmids, pDAB100660 and pDAB100662, were constructed and transformed into plants to simulate donor insertion within the ELP site. Transgenic plants which were produced with the pDAB100660 and pDAB100662 control constructs were assayed using an In-Out PCR design, and the conditions for the In-Out PCR assay were determined on these plant materials.

[0429] The In-Out PCR assay has an "Out" primer in the ELP sequence while an "In" primer is placed in the donor sequence, so that only when the donor is inserted in the target site would the two primers amplify a sequence that spans the junction of the insertion (Table 34). For each sample, two In-Out PCR reactions were used to detect donor insertion with two different orientations. Positive and negative controls were included in the assay.

[0430] A DNA intercalating dye, SYTO-13, was used in the PCR mix in order to detect amplification in real-time on a thermocycler with fluorescence detection capability. In addition, a melting temperature (Tm) analysis program was attached to a regular PCR program so the amplified products can be analyzed for their Tm profiles. Similarity of the Tm profile between an unknown sample and the positive control strongly suggests the unknown sample has the same amplified product as that of the positive control. Positive targeted samples identified in the real-time, In-Out PCR assay were further visualized using gel shift analysis. Expected amplicon sizes are ~1.5 kb for Orientation 1 (as in pDAB100662), and ~1.4 kb for Orientation 2 (as in pDAB100660).

Table 34: Primers for In-Out PCR to detect NHEJ mediated targeting at ELP1 in maize

[0431] PCR reactions were conducted using 10 ng of template genomic DNA, 0.2 μM dNTPs, 0.2 μM forward and reverse primers, 4 μM SYTO-13 and 0.15 μL of Ex Taq HS. Reactions were completed in two steps: the first step consisted of one cycle at 94°C (2 minutes) and 35 cycles at 98°C (12 seconds), 66°C (30 seconds) and 68°C (1.3 minutes); the second step was a Tm program covering 60-95°C followed by 65°C (30 seconds) and 72°C (10 minutes). Products were visualized using the ABI software as well as by running on a 1.2% TAE gel.

[0432] The In-Out PCR junction analysis products for eZFN1 and eZFN3 NHEJ-directed targeting treatments, which included donor DNA alone, eZFN alone, or Donor DNA with eZFN DNA (at a ratio of either 1:1 or 10:1), were run out on agarose gels. The results indicated that the PCR amplicon size of the donor and eZFN DNA samples was that expected for an NHEJ targeted event.

[0433] Sequence of Target/donor Junctions: From the ELP targeted events which were confirmed via in-out PCR analysis, the PCR amplicon products were confirmed via sequencing and the target-donor junctions were validated by standard Sanger sequencing. Briefly, junction PCR analysis was performed on all replicates of each treatment group. PCR primers were chosen to amplify one side of the insert junction sequences that were either in the direct or reverse orientation. PCR products were observed in samples generated from the eZFN and Donor DNA samples, but not from the control samples, comprising the Donor DNA alone or eZFN alone samples. PCR products were evident in the majority of replicate samples from both ratios of eZFNs and Donor DNAs used.

[0434] Representative samples of the PCR products were cloned and sequenced. For both the direct and reverse orientation, sequences of the PCR products from four different reactions are shown in Figure 41. Nine unique haplotypes were observed for the direct orientation of the insert, as expected from misrepair to the junction ends. Three of the 16 sequences aligned with the sequence expected from annealing and ligation of intact ends of the inserted Donor DNA and the target sequence. All sequences of the PCR products in the reverse orientation had indels at the junctions as expected since the single-stranded ends of the Donor DNA and target DNA are not complementary.
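The tally of junction haplotypes described above (number of unique sequences, and number matching the perfectly ligated junction) can be sketched as follows. The reads and the expected junction below are hypothetical toy strings, not the sequences of the figure, and `summarize_junctions` is an illustrative helper name.

```python
from collections import Counter


def summarize_junctions(reads, expected_junction):
    """Count total reads, unique haplotypes, and reads that match the
    junction expected from ligation of intact donor and target ends."""
    counts = Counter(reads)
    return {
        "total_reads": len(reads),
        "unique_haplotypes": len(counts),
        "perfect_ligation_reads": counts[expected_junction],
    }


# Hypothetical cloned junction reads; '-' marks a deleted base in the alignment.
expected = "ACGTTGCA"
reads = [
    "ACGTTGCA",   # intact ends ligated
    "ACGTTGCA",
    "ACGT-GCA",   # single-base deletion
    "ACGTTTGCA",  # single-base insertion
    "ACG--GCA",   # two-base deletion
    "ACGT-GCA",
]

summary = summarize_junctions(reads, expected)
print(summary)
```

Reads that differ from the expected junction correspond to the indel-containing haplotypes attributed to misrepair at the junction ends.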

Results of NHEJ-Mediated Donor Targeting in Maize Protoplasts

[0435] A maize protoplast-based transient assay system was developed that showed high, reproducible expression of reporter genes. Protoplasts were derived from a Hi-II maize suspension culture that was developed at DAS. A transgenic line of maize (maize line 106685[1]-007) that harbored an insert of the ELP was used for NHEJ mediated integration of the donor DNA sequence.

NHEJ targeting of ELP embryos via microparticle bombardment

[0436] Three days prior to microparticle bombardment, 1.5-2.2 mm embryos were isolated from surface sterilized ears and placed (scutellum-up) onto N6 basal medium and vitamins (Phytotechnology Laboratories, Shawnee Mission, KS) with 2.0 mg/L 2,4-D, 2.8 g/L proline, 30 g/L sucrose, 100 mg/L casein enzymatic hydrolysate, 100 mg/L myo-inositol and 4.25 mg/L silver nitrate solidified with 2.5 g/L Gelzan (Phytotechnology Laboratories, Shawnee Mission, KS). Four hours prior to microparticle bombardment, ~35-40 embryos were placed (scutellum-up) in the center of a 100 x 15 mm Petri dish containing the same medium with the addition of 36.4 g/L sorbitol and 36.4 g/L mannitol.

[0437] Microparticle gold (0.6 micron, BioRad, Hercules, CA) was prepared for DNA precipitation by weighing out 15 mg into a sterile, siliconized 1.7 mL microcentrifuge tube (Sigma-Aldrich, St. Louis, MO), and 500 μL of ice cold 100% ethanol was slowly added. After a 15 second sonication in an FS-14 ultrasonic water bath (Fisher Scientific, Nazareth, PA), the gold was allowed to settle for 30 minutes at room temperature prior to centrifugation at 3,000 rpm for 60 seconds using a MiniSpin (Eppendorf, Hauppauge, NY). After removing the supernatant, 1 mL of ice cold, sterile water was added, mixed by pipetting up and down, and allowed to settle for 3-5 minutes prior to centrifugation at 3,000 rpm for 60 seconds. The wash step was repeated one more time prior to suspending the gold in 500 μL of ice cold, sterile water. The washed gold was then aliquoted into separate 1.7 mL sterile, siliconized microcentrifuge tubes (50 μL per tube), being careful to keep the gold well mixed by pipetting up and down between tubes. The washed gold (~1.5 mg per 50 μL) was then stored at -20°C until needed.
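The aliquot arithmetic in this preparation can be checked with a short calculation. This is a sketch only: it assumes the resuspension and aliquot volumes are in microliters, uses the plasmid amounts stated for the DNA precipitation step (0.5 μg ZFN + 4.5 μg Donor per tube), and all variable and function names are illustrative.

```python
GOLD_MG = 15.0            # gold weighed out
RESUSPENSION_UL = 500.0   # final volume of sterile water (assumed microliters)
ALIQUOT_UL = 50.0         # volume per storage tube (assumed microliters)
ZFN_UG = 0.5              # ZFN plasmid per precipitation
DONOR_UG = 4.5            # donor plasmid per precipitation


def gold_per_aliquot_mg(gold_mg: float, volume_ul: float, aliquot_ul: float) -> float:
    """Mass of gold in one aliquot of a well-mixed suspension."""
    return gold_mg * aliquot_ul / volume_ul


aliquots = int(RESUSPENSION_UL // ALIQUOT_UL)
gold_mg = gold_per_aliquot_mg(GOLD_MG, RESUSPENSION_UL, ALIQUOT_UL)
total_dna_ug = ZFN_UG + DONOR_UG
donor_to_zfn_ratio = DONOR_UG / ZFN_UG
dna_ug_per_mg_gold = total_dna_ug / gold_mg

print(f"{aliquots} aliquots of {gold_mg} mg gold each")
print(f"{total_dna_ug} ug DNA per tube, donor:ZFN = {donor_to_zfn_ratio}:1")
print(f"{dna_ug_per_mg_gold:.2f} ug DNA per mg gold")
```

The calculation reproduces the stated ~1.5 mg of gold per tube and shows that the donor is supplied in ninefold mass excess over the ZFN plasmid.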

[0438] For DNA precipitation, one tube containing ~1.5 mg of gold in 50 μL of water was thawed for each 10 targets to be bombarded, sonicated in an ultrasonic water bath for 15 seconds and then placed on ice. Plasmid DNA (0.5 μg ZFN + 4.5 μg Donor) was premixed in 0.6 mL microcentrifuge tubes (Fisher Scientific, Nazareth, PA) and added to the gold suspension, gently pipetting up and down several times to mix thoroughly. Fifty μL of ice cold 2.5 M calcium chloride was added and gently mixed by pipetting up and down several times. Twenty μL of cold 0.1 M spermidine was then added and gently mixed by pipetting up and down several times. The tube was then capped and placed onto a Vortex Genie 2 (Scientific Instruments Inc., Bohemia, NY) and allowed to mix (set at 'shake 2') for 10 minutes, after which the mixture was allowed to settle for 3-5 minutes. After centrifuging for 15 seconds at 5,000 rpm, the supernatant was carefully removed and 250 μL of ice cold, 100% ethanol was added, the tube capped and mixed vigorously by hand to dislodge the pellet. After a second centrifugation for 15 seconds at 5,000 rpm, 120 μL of ice cold, 100% ethanol was added, the tube capped and mixed vigorously by hand to dislodge the pellet.

[0439] For microparticle bombardment, sterilized macrocarriers (BioRad, Hercules, CA) were fit into stainless steel holders (BioRad, Hercules, CA) and autoclaved. Ten μL of gold/DNA suspension was evenly spread in the center of the macrocarrier, being sure to pipette up and down so as to keep the suspension well mixed, and the macrocarrier was then placed onto a piece of sterile 125 mm Whatman #4 filter paper (GE Healthcare, Buckinghamshire, UK) on a bed of 8-mesh Drierite (W.A. Hammond Drierite Co., Xenia, OH) in a 140 x 25 mm glass Petri dish. The gold/DNA was allowed to dry completely for about 10 minutes. Rupture discs (650 psi, BioRad, Hercules, CA) were sterilized by soaking for a few minutes in isopropyl alcohol and then loaded into the retaining cap of a microparticle bombardment device (PDS-1000, BioRad, Hercules, CA). An autoclaved stopping screen (BioRad, Hercules, CA) and a loaded macrocarrier were placed into the launch assembly, the lid was screwed on, and the assembly was slid into the bombardment chamber just under the nozzle. The Petri dish containing the screen-covered, leaf target was uncovered and placed in the bombardment chamber 6 cm below the nozzle. A vacuum was pulled (-0.9 bar) and the device was fired.

[0440] The next day (16-20 hours after bombardment), the bombarded embryos were transferred (scutellum-up) to N6 basal medium and vitamins (Phytotechnology Laboratories, Shawnee Mission, KS) with 2.0 mg/L 2,4-D, 2.8 g/L proline, 30 g/L sucrose, 100 mg/L casein enzymatic hydrolysate, 100 mg/L myo-inositol and 4.25 mg/L silver nitrate solidified with 2.5 g/L Gelzan (Phytotechnology Laboratories, Shawnee Mission, KS). After 7 days (8 days post-bombardment, 11 days from culture initiation), embryos were transferred to selection medium: N6 basal medium and vitamins (Phytotechnology Laboratories, Shawnee Mission, KS) with 2.0 mg/L 2,4-D, 2.8 g/L proline, 30 g/L sucrose, 100 mg/L casein enzymatic hydrolysate, 100 mg/L myo-inositol, 4.25 mg/L silver nitrate and 0.0362 mg/L R-haloxyfop acid solidified with 2.5 g/L Gelzan (Phytotechnology Laboratories, Shawnee Mission, KS). After two weeks, embryos were transferred to fresh selection medium: N6 basal medium and vitamins (Phytotechnology Laboratories, Shawnee Mission, KS) with 2.0 mg/L 2,4-D, 2.8 g/L proline, 30 g/L sucrose, 100 mg/L casein enzymatic hydrolysate, 100 mg/L myo-inositol, 4.25 mg/L silver nitrate and 0.181 mg/L R-haloxyfop acid solidified with 2.5 g/L Gelzan (Phytotechnology Laboratories, Shawnee Mission, KS), and after an additional two weeks they were transferred to fresh medium of the same composition.

[0441] Callus growing on 0.181 mg/L R-haloxyfop was sampled for molecular analysis. Sampling involved placing either ~50 mg into 1.2 mL cluster tubes for PCR analysis or ~200 mg into 2.0 mL Safe Lock tubes (Eppendorf, Hauppauge, NY) for Southern blot analysis, surrounded by dry ice for rapid freezing. The tubes were then covered in 3M micropore tape (Fisher Scientific, Nazareth, PA) and lyophilized for 48 hours in a Virtual XL-70 (VirTis, Gardiner, NY). Once the tissue was lyophilized, the tubes were capped and stored at 8°C until analysis.

Molecular confirmation of maize ELP targeting by NHEJ in plants

[0442] DNA extraction: DNA was extracted from maize tissue using a Qiagen BIOSPRINT 96™ robot via automation, and DNA was eluted in 200 μL of 1:1 TE Buffer/distilled water. Two μL of each sample was quantified on a THERMOSCIENTIFIC NANODROP 8000™ and samples were normalized to 100 ng/μL using a QIAGEN BIOROBOT 3000™. Normalized DNA was stored at 4°C until further analysis.

[0443] Copy Number Estimation: Transgene copy number determination by hydrolysis probe assay, analogous to the TAQMAN® assay, was performed by real-time PCR using the LIGHTCYCLER®480 system (Roche Applied Science, Indianapolis, IN). Assays were designed for the aad-1 transgene and the internal reference gene Invertase using LIGHTCYCLER® PROBE DESIGN SOFTWARE 2.0. For amplification, LIGHTCYCLER®480 Probes Master mix (Roche Applied Science, Indianapolis, IN) was prepared at 1X final concentration in a 10 μL volume multiplex reaction containing 0.4 μM of each primer and 0.2 μM of each probe (Table 32). A two-step amplification reaction was performed with an extension at 60°C for 40 seconds with fluorescence acquisition. Analysis of real-time PCR copy number data was performed using LIGHTCYCLER® software release 1.5 using the relative quant module and is based on the ΔΔCt method. For this, a sample of gDNA from a single copy calibrator and a known 2 copy check were included in each run. The locations of the primers are shown in Figure 41.

Table 32: Primer/Probe Sequences for hydrolysis probe assay of AAD1 and internal reference (Inv)

Name ; SEQ ID NO ; Sequence ; Label
GAAD1R ; SEQ ID NO:466 ; 5' CAACATCCATCACCTTGACTGA 3' ; -
GAAD1R ; SEQ ID NO:467 ; 5' CACAGAACCGTCGCTTCAGCAACA 3' ; FAM
IVF-Taq ; SEQ ID NO:468 ; 5' TGGCGGACGACGACTTGT 3' ; -
IVR-Taq ; SEQ ID NO:469 ; 5' AAAGTTTGGAGGCTGCCGT 3' ; -
IV-Probe ; SEQ ID NO:470 ; 5' CGAGCAGACCGCCGTGTACTTCTACC 3' ; HEX
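The ΔΔCt calculation underlying the copy number estimate can be sketched as below. This is a generic illustration, not the LIGHTCYCLER® software's implementation: it assumes 100% amplification efficiency for both the aad-1 and Invertase assays, and the function name and Ct values are hypothetical.

```python
def copy_number_ddct(ct_target: float, ct_reference: float,
                     cal_ct_target: float, cal_ct_reference: float,
                     calibrator_copies: float = 1.0) -> float:
    """Estimate transgene copy number by the delta-delta-Ct method.

    delta Ct  = Ct(target) - Ct(reference), computed for sample and calibrator
    ddCt      = delta Ct(sample) - delta Ct(calibrator)
    copies    = calibrator_copies * 2 ** (-ddCt)   (assumes perfect doubling)
    """
    delta_sample = ct_target - ct_reference
    delta_calibrator = cal_ct_target - cal_ct_reference
    ddct = delta_sample - delta_calibrator
    return calibrator_copies * 2.0 ** (-ddct)


# Hypothetical Ct values: single copy calibrator and a 2 copy check sample.
calibrator = {"target": 25.0, "reference": 25.0}
two_copy_check = {"target": 24.0, "reference": 25.0}

estimate = copy_number_ddct(two_copy_check["target"], two_copy_check["reference"],
                            calibrator["target"], calibrator["reference"])
print(f"Estimated copies: {estimate}")
```

Including a known 2 copy check in each run, as described above, gives a direct test of whether the estimate behaves as expected before unknown samples are called.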

[0444] ELP locus disruption assay: The ELP locus disruption assay was performed by real-time PCR using the LIGHTCYCLER®480 system (Roche Applied Science, Indianapolis, IN) as described above. Assays were designed to monitor eZFN1 and eZFN3 binding sequences within ELP1 and the internal reference gene IVF using LIGHTCYCLER® Probe Design Software 2.0. Analysis for the disruption assay was performed using target to reference ratio. The results are shown in Figure 42.

[0445] Locus Specific In-Out PCR: The In-Out PCR assay was completed as described above. The In-Out PCR junction analysis products for eZFN1 and eZFN3 NHEJ-directed targeting, from treatments that included donor DNA alone, eZFN alone, or donor DNA with eZFN DNA (at a ratio of either 1:1 or 10:1), were run out on agarose gels. The results indicated that the PCR amplicon size from the donor plus eZFN DNA samples was that expected for an NHEJ targeted event.

[0446] Sequence of Target/donor Junctions: For the ELP targeted events that were confirmed via in-out PCR analysis, the PCR amplicon products were sequenced and the target-donor junctions were validated by standard Sanger sequencing. Briefly, junction PCR analysis was performed on all replicates of each treatment group. PCR primers were chosen to amplify one side of the insert junction, with the insert in either the direct or the reverse orientation. PCR products were observed in samples treated with both eZFN and Donor DNA, but not in the control samples treated with Donor DNA alone or eZFN alone. PCR products were evident in the majority of replicate samples from both ratios of eZFNs and Donor DNAs used.

[0447] Representative samples of the PCR products were cloned and sequenced. For both the direct and reverse orientation, sequences of the PCR products from four different reactions are shown in Figure 43. Nine unique haplotypes were observed for the direct orientation of the insert, as expected from misrepair to the junction ends. Three of the 16 sequences aligned with the sequence expected from annealing and ligation of intact ends of the inserted Donor DNA and the target sequence. All sequences of the PCR products in the reverse orientation had indels at the junctions as expected since the single-stranded ends of the Donor DNA and target DNA are not complementary.

Southern Blot Analysis: Maize callus samples that were initially identified as containing a donor sequence integrated within the ELP locus via the locus specific disruption assay and the in-out PCR assay were selected for further analysis by Targeted Integration (TI) Southern blots. For TI Southerns, DNA was digested and probed with enzymes and probes at the target locus. For DNA extraction for Southerns, tissue samples were collected in 2 mL Eppendorf tubes (Eppendorf) and lyophilized for 2 days. Tissue maceration was performed with a Kleco tissue pulverizer and tungsten beads (Kleco, Visalia, CA). Following tissue maceration, the genomic DNA was isolated using the DNEASY PLANT MINI KIT™ (Qiagen, Germantown, MD) according to the manufacturer's suggested protocol.

[0448] Genomic DNA (gDNA) was quantified by QUANT-IT PICO GREEN DNA™ assay kit (Molecular Probes, Invitrogen, Carlsbad, CA). Quantified gDNA was adjusted to 4 μg for the Southern blot analysis. DNA samples were then digested using the PmeI restriction enzyme (New England BioLabs) overnight at 37°C, followed by a clean-up using QUICK-PRECIP™ (Edge BioSystem, Gaithersburg, MD) according to the manufacturer's suggested protocol. DNA was then resuspended in 1X dye and electrophoresed for 5 hours on a 0.8% SEAKEM LE agarose gel (Lonza, Rockland, ME) at 110 volts in a cold room. The gel was denatured, neutralized, and then transferred to a charged nylon membrane (Millipore, Bedford, MA) overnight; DNA was crosslinked to the membrane using the UV STRATA LINKER 1800™ (Stratagene, La Jolla, CA), and blots were pre-hybridized with 20 mL of PERFECTHYB PLUS™ (Sigma, St. Louis, MO). The probe was labeled using PRIME-IT RMT™ random (Stratagene, La Jolla, CA) according to the manufacturer's suggested protocol and purified using PROBE QUANT G-50 MICRO COLUMNS™ (GE Healthcare, Buckinghamshire, UK) according to the manufacturer's suggested protocol. Approximately 20,000,000 cpm of the labeled probe was added to the blots and incubated overnight. Blots were washed twice for 15 minutes per wash, placed on a phosphor image screen for 24 hours and analyzed by a STORM 860 SCANNER™ (Molecular Dynamics). Results showed the expected bands for targeted integration (~6.9 kb).

Results of ELP loci targeting via NHEJ-mediated integration

[0449] The results of this study demonstrate precision insertion in maize of a donor DNA plasmid by NHEJ, subsequent to in vivo, ZFN-generated cleavage of the target DNA and the Donor DNA. Targeting of Donor DNAs occurred using two different donor DNAs (different by which eZFN binding sequences were contained in the ELP) and two different eZFNs within protoplasts and maize embryos. Integration within the ELP loci via an NHEJ repair mechanism occurred in both orientations. The donor DNA insert was detected in both of these orientations in the samples tested.

[0450] Precision targeting of genes using the NHEJ repair mechanism in the tissues of differing plant species provides a significant advantage over known repair mechanisms such as homologous recombination mediated repair. The NHEJ repair mechanism is the dominant repair mechanism, operating in most if not all plant tissues. By contrast, the homologous recombination mediated repair pathway operates during the G2 phase of the cell cycle in plants. Because many plant tissues capable of transformation do not actively undergo cell division, they would not support gene targeting by a homologous recombination mediated repair pathway. Another advantage of the NHEJ repair mediated pathway for donor insertion within the genome of plants is that, unlike the homologous recombination mediated repair pathway, NHEJ-mediated repair does not require extensive regions of homology, which reduces the size of the donor DNA sequences necessary for genomic insertion. Finally, donor DNA sequences of larger sizes can be successfully inserted into genomic loci using the NHEJ repair mediated pathway as compared to the homologous recombination mediated repair pathway. Donor-DNA mediated integration within a genomic locus via the homologous recombination mediated repair pathway requires that a DNA polymerase copy the DNA template contained in the Donor DNA. In contrast, the NHEJ repair mediated pathway only requires interaction of the donor DNA with the target DNA at two points (their ends) and does not require template-dependent DNA synthesis. Accordingly, the NHEJ repair mediated pathway can be utilized to integrate larger donor DNA sequences within the targeted genomic locus.

Example 10: Exemplary Sequences

[0451] SEQ ID NO:116

TCGCCCAAACCCTCGCCGCCGCCATGGCCGCAGCCACCTCCCCCGCCGTCGCATT CTCGGGCGCCACCGCCGCCGCCATGCCCAAACCCGCCCGCCATCCTCTCCCGCGC CACCAGCCCGTCTCGCGCCGCGCGCTCCCCGCCCGCGTCGTCAGGTGTTGCGCCG CGTCCCCCGCCGCCACCTCCGCCGCGCCTCCCGCAACCGCGCTCCGGCCATGGGG CCCGTCCGAGCCCCGCAAGGGCGCCGACATCCTCGTCGAGGCGCTCGAGCGCTGC GGCATCGTCGACGTCTTCGCCTACCCCGGCGGCGCCTCCATGGAGATCCACCAGG CGCTGACGCGCTCGCCCGTCATCACCAACCACCTCTTCCGCCACGAGCAGGGGGA GGCGTTCGCGGCGTCCGGCTACGCCCGCGCGTCCGGCCGCGTCGGCGTCTGCGTC GCCACCTCCGGCCCGGGGGCCACCAACCTCGTCTCCGCGCTCGCCGACGCCCTCC TCGACTCCATCCCCATGGTCGCCATCACGGGCCAGGTCCCCCGCCGCATGATCGG CACGGACGCGTTCCAGGAGACGCCCATAGTGGAGGTCACGCGCTCCATCACCAA GCACAACTACCTGGTCCTTGACGTGGAGGATATCCCCCGCGTCATCCAGGAAGCC TTCTTCCTTGCATCCTCTGGCCGCCCGGGGCCGGTGCTAGTTGATATCCCCAAGGA CATCCAGCAGCAGATGGCTGTGCCCGTCTGGGACACTCCAATGAGTTTGCCAGGG TACATCGCCCGCCTGCCCAAGCCACCATCTACTGAATCGCTTGAGCAGGTCCTGC GTCTGGTTGGCGAGTCACGGCGCCCAATTCTGTATGTTGGTGGTGGCTGCGCTGC GTCTGGCGAGGAGTTGCGCCGCTTTGTTGAGCTTACTGGGATTCCAGTTACAACT ACTCTGATGGGCCTTGGCAACTTCCCCAGCGACGACCCACTGTCTCTGCGCATGC TTGGGATGCATGGCACTGTGTATGCAAATTATGCAGTAGATAAGGCTGACCTGTT GCTCGCATTTGGTGTGCGGTTTGATGATCGTGTGACTGGGAAAATCGAGGCTTTT GCAAGCAGGTCCAAGATTGAGCACATTGACATTGACCCAGCTGAGATTGGCAGA ACAAGCAGCCACATGTCTCCATTTGTGCAGATGTTAAGCTTGCTTTACAGGGGTT GAATGATCTATTAAATGGGAGCAAAGCACAACAGGGTCTGGATTTTGGTCCATGG CACAAGGAGTTGGATCAGCAGAAGAGGGAGTTTCCTCTAGGATTCAAGACTTTTG GCGAGGCCATCCCGCCGCAATATGCTATCCAGGTACTGGATGAGCTGACAAAAG GGGAGGCGATCATTGCCACTGGTGTTGGGCAGCACCAGATGTGGGCGGCTCAGT ATTACACTTACAAGCGGCCACGGCAGTGGCTGTCTTCGTCTGGTTTGGGGGCAAT GGGATTTGGGTTACCAGCTGCAGCTGGCGCTGCTGTGGCCAACCCAGGTGTTACA GTTGTTGACATTGATGGTGATGGTAGTTTCCTCATGAACATTCAGGAGTTGGCGTT GATCCGCATTGAGAACCTCCCAGTGAAGGTGATGATATTGAACAACCAGCATCTG GGAATGGTGGTGCAGTGGGAGGATAGGTTTTACAAGGCCAATCGGGCGCACACA TACCTTGGCAACCCAGAAAATGAGAGTGAGATATATCCAGATTTTGTGACGATTG CTAAAGGATTCAACGTTCCAGCAGTTCGAGTGACGAAGAAGAGCGAAGTCACTG CAGCAATCAAGAAGATGCTTGAGACCCCAGGGCCATACTTGTTGGATATCATAGT CCCGCATCAGGAGCACGTGCTGCCTATGATCCCAAGCGGTGGTGCTTTCAAGGAC 
ATGATCATGGAGGGTGATGGCAGGACCTCGTACTGAAATTTCGACCTACAAGACC TACAAGTGTGACATGCGCAATCAGCATGATGCCCGCGTGTTGTATCAACTACTAG GGGTTCAACTGTGAGCCATGCGTTTTCTAGTTTGCTTGTTTCATTCATATAAGCTT GTATTACTTAGTTCCGAACCCTGTAGTTTTGTAGTCTATGTTCTCTTTTGTAGGGA TGTGCTGTCATAAGATGTCATGCAAGTTTCTTGTCCTACATATCAATAATAAGTAC TTCCATGNAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAA

[0452] SEQ ID NO:117

TCGCCCAAACCCTCGCCGCCGCCATGGCCGCAGCCACCTCCCCCGCCGTCGCATT CTCGGGCGCCGCCGCCGCCGCCGCCGCCATACCCAAACCCGCCCGCCAGCCTCTC CCGCGCCACCAGCCCGCCTCGCGCCGCGCGCTCCCCGCCCGCATCGTCAGGTGCT GCGCCGCGTCCCCCGCCGCCACCTCCGTCGCGCCTCCCGCCACCGCGCTCCGGCC GTGGGGCCCCTCCGAGCCCCGCAAGGGCGCCGACATCCTCGTCGAGGCGCTGGA GCGCTGCGGCATCGTCGACGTCTTCGCCTACCCTGGCGGCGCGTCCATGGAGATC CACCAGGCGCTGACGCGCTCGCCAGTCATCACCAACCACCTCTTCCGCCACGAGC AGGGGGAGGCGTTCGCGGCGTCCGGGTACGCCCGCGCGTCCGGCCGCGTCGGCG TCTGCGTCGCCACCTCCGGCCCGGGGGCCACCAACCTCGTCTCCGCGCTCGCCGA CGCTCTCCTCGACTCCATCCCCATGGTCGCCATCACGGGCCAGGTCCCCCGCCGC ATGATCGGCACGGATGCGTTCCAGGAGACGCCCATCGTGGAGGTCACGCGCTCC ATCACCAAGCACAACTACCTGGTCCTTGACGTGGAGGATATCCCCCGCGTCATCC AGGAAGCCTTCTTCCTCGCATCCTCTGGCCGCCCGGGGCCGGTGCTGGTTGATAT CCCCAAGGACATCCAGCAGCAGATGGCTGTGCCTGTCTGGGACACGCCGATGAG TTTGCCAGGGTACATCGCCCGCCTGCCCAAGCCACCATCTACTGAATCGCTTGAG CAGGTCCTGCGTCTGGTTGGCGAGTCACGGCGCCCAATTCTGTATGTTGGTGGTG GCTGCGCTGCATCTGGTGAGGAGTTGCGCCGCTTTGTTGAGCTCACTGGGATTCC AGTTACAACTACTCTTATGGGCCTTGGCAACTTCCCCAGTGACGACCCACTGTCTC TGCGCATGCTGGGGATGCATGGCACTGTGTATGCAAATTATGCAGTAGATAAGGC TGACCTGTTGCTTGCATTTGGTGTGCGGTTTGATGATCGTGTGACCGGGAAAATC GAGGCTTTTGCAAGCAGGTCCAAGATTGAGCACATTGACATTGACCCAGCTGAGA TTGGCAGAACAAGCAGCCACATGTCTCCATTTGTGCAGATGTTAAGCTTGCTTTA CAGGGGTTGAATGCTCTATTAAATGGGAGCAAAGCACAACAGGGTCTGGATTTTG GTCCATGGCACAAGGAGTTGGATCAGCAGAAGAGGGAGTTTCCTCTAGGATTCA AGACTTTTGGTGAGGCCATCCCGCCGCAATATGCTATCCAGGTACTGGATGAGCT GACAAAAGGGGAGGCGATCATTGCCACCGGTGTTGGGCAGCATCAGATGTGGGC GGCTCAGTATTACACTTACAAGCGGCCACGGCAGTGGCTGTCTTCATCCGGTTTG GGTGCAATGGGATTTGGGTTGCCAGCTGCAGCTGGCGCTGCTGTGGCCAACCCAG GTGTTACAGTTGTTGACATTGATGGGGATGGTAGTTTCCTCATGAACATTCAGGA GTTGGCGTTGATCCGTATTGAGAACCTCCCAGTGAAGGTGATGATATTGAACAAC CAGCATCTGGGAATGGTGGTGCAGTGGGAGGATAGGTTTTACAAGGCCAACCGG GCGCACACATACCTTGGCAACCCAGAAAATGAGAGTGAGATATATCCAGATTTTG TGACGATTGCTAAAGGATTCAACGTTCCGGCAGTTCGTGTGACGAAGAAGAGCG AAGTCACTGCAGCAATCAAGAAGATGCTTGAGACCCCAGGGCCATACTTGTTGG ATATCATTGTCCCGCATCAGGAGCACGTGCTGCCTATGATCCCAAGCGGTGGTGC 
TTTTAAGGACATGATCATGGAGGGTGATGGCAGGACCTCGTACTGAAATTTCGAC CTACAAGACCTACAAGTGTGACATGCGCAATCAGCATGATACCTGCGTGTTGTAT CAACTACTGGGGGTTCAACTGTGAACCATGCGTTTTCTAGTTTGCTTGTTTCATTC ATATAAGCTTGTGTTACTTAGTTCCGAACCGTGTAGTTTTGTAGTCTCTGTTCTCTT TTGTAGGGATGTGCTGTCATAAGATATCATGCAAGTTTCTTGTCCTACATATCAAT AATAAGCACTTCCATGNAANAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAA

[0453] SEQ ID NO:118

TCGCCCAAACCCTCGCCGCCGCCATGGCCGCNGCCACCTCCCCCGCCGTCGCATT CTCGGGCGCCNCCGCCGCCGCCATNCCCAAACCCGCCCGCCANCCTCTCCCGCGC CACCAGCCCGNCTCGCGCCGCGCGCTCCCCGCCCGCNTCGTCAGGTGNTGCGCCG CGTCCCCCGCCGCCACCTCCGCCGCGCCCCCCGCCACCGCGCTCCGGCCCTGGGG CCCGTCCGAGCCCCGCAAGGGCGCCGACATCCTCGTCGAGGCGCTCGAGCGCTGC GGCATCGTCGACGTATTCGCCTACCCCGGCGGCGCGTCCATGGAGATCCACCAGG CGCTGACGCGCTCGCCCGTCATCACCAACCACCTCTTCCGCCACGAGCAGGGGGA GGCGTTCGCGGCGTCCGGCTACGCCCGCGCGTCCGGCCGCGTCGGCGTCTGCGTC GCCACCTCCGGCCCGGGGGCCACCAACCTCGTCTCCGCGCTCGCTGACGCCCTCC TCGACTCCATCCCCATGGTCGCCATCACGGGCCAGGTCCCCCGCCGCATGATCGG CACGGACGCGTTCCAGGAGACGCCCATAGTGGAGGTCACGCGCTCCATCACCAA GCACAACTACCTGGTCCTTGACGTGGAGGATATCCCCCGCGTCATCCAGGAAGCC TTCTTCCTCGCGTCCTCTGGCCGCCCGGGGCCGGTGCTGGTTGATATCCCCAAGG ATATCCAGCAGCAGATGGCCGTGCCTATCTGGGACACGCCGATGAGTTTGCCAGG GTACATCGCCCGCCTGCCCAAGCCACCATCTACTGAATCGCTTGAGCAGGTCCTG CGTCTGGTTGGCGAGTCACGGCGCCCAATTCTGTATGTTGGTGGTGGCTGCGCTG CATCCGGCGAGGAGTTGCGCCGCTTTGTTGAGCTCACTGGGATTCCGGTTACAAC TACTCTGATGGGCCTTGGCAACTTCCCCAGCGACGACCCACTGTCTCTGCGCATG CTTGGGATGCATGGCACTGTGTATGCAAATTATGCAGTCGATAAGGCTGACCTGT TGCTTGCATTTGGTGTGCGGTTTGATGATCGCGTGACTGGGAAAATCGAGGCCTT TGCAAGCAGGTCCAAGATTGAGCACATTGACATTGACCCAGCTGAGATTGGCAG AACAAGCAGCCACATGTCTCCATTTGTGCAGATGTTAAGCTTGCTTTACAGGGGT TGAATGCTCTATTAAATGGGAGCAAAGCACAACAGGGTCTGGATTTTGGTCCATG GCACAAGGAGTTGGATCAGCAGAAGAGGGAGTTTCCTCTAGGATTCAAGACTTTT GGCGAGGCCATCCCGCCGCAATATGCTATCCAGGTACTGGATGAGCTGACAAAA GGGGAGGCGATCATTGCTACTGGTGTTGGGCAGCACCAGATGTGGGCGGCTCAG TATTACACTTACAAGCGGCCACGGCAGTGGCTGTCTTCGTCTGGTTTGGGGGCAA TGGGATTTGGGTTACCAGCTGCAGCTGGCGCTGCTGTGGCCAACCCAGGTGTTAC AGTTGTTGACATTGATGGAGATGGTAGTTTCCTCATGAACATTCAGGAGTTGGCA TTGATCCGTATTGAGAACCTCCCTGTGAAGGTGATGATATTGAACAACCAGCATC TGGGAATGGTGGTGCAATGGGAGGATAGGTTTTACAAGGCCAATCGGGCGCACA CATACCTTGGCAACCCAGAAAATGAGAGTGAGATATATCCAGATTTTGTGACGAT TGCTAAAGGATTCAACGTTCCGGCAGTTCGTGTGACGAAGAAGAGCGAAGTCAC TGCAGCAATCAAGAAGATGCTTGAGACCCCAGGGCCATACTTGTTGGATATCATC GTCCCGCATCAGGAGCACGTGCTGCCTATGATCCCAAGCGGTGGTGCTTTCAAGG 
ACATGATCATGGAGGGTGATGGCAGGACCTCGTACTGAAATTTCGACCTACAAG ACCTACAAGTGTGACATGCGCAATCAGCATGGTGCCCGCGTGTTGTATCAACTAC TAGGGGTTCAACTGTGAACCATGCGTTTTCTAGTTTGCTTGTTTCATTCATATAAG CTTGTGTTACTTAGTTCCGAACCCTGTAGCTTTGTAGTCTATGCTCTCTTTTGTAGG GATGTGCTGTCATAAGATATCATGCAAGTTTCTTGTCCTACATATCAATAATAAGT ACTTCCATGNAANAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAA

[0454] SEQ ID NO:119

CGTTGTGCCTTGGCAGTCTCAGGTTGAGCCCTCACCATTGAAGTAGCATGGGTCA TTGGATTGACCCGATTTGACGGCGGATCTATTGGATCTTCCCTTTGTGTCGTTTTA TACTGGTATAGATGTTTAACACATATTTGGAAAATATATTCAAAACATGTTTCTAT AAAAAAGTTTAAACTATACATGTATAATGGAAGTCATTTATAAGAAATGTTTTAC ATGTATAAAAGATGTACATCATATGTGCAAAAGTAGACATGTGTTAGAAAAAAT AAACAAACAAATACATAAAAAGAAAATCAAAGAAAAAACAACCCAAAAAACCA AAGAAAATAAAGAAGAAGAAGAAAAAGAGAAAAAACATTGAAAATCAAAGAAG AAAAAAACATAAAGAAAAGAAAACCGAAAAATACTGGCAAAAACACACAAAAA ATGAAAAGAAAAAATAAAGAAAACCGGACTTTACCAATCGAACGGAGCGATCGG ACACGAATGAGCGAAGGCATGCATCGAGCAACACCGCTAATTGACCGGCCCGTA GTCGTTCGCCCGTAGACCATTCATAAGAATCGGTATCGGAGAGACATAGGGGTTC TTTGGTTTCTAACCATATCTTGTCACACTTTACCATACATCACCTTAGTCAAATCT GATCAAATTAGGTGAGTATTTGGTTCTAGCCACATCTAAGGCAAGATTTGTTTTTC TGAGCAGTGAACCCCATATGTCATAGACAGAAAAATTGTGAAAAGATTCCTTTAG ACGGTCAAAGCGTGGTTAACAATTTAATCAACTCAAGTAAGATAAATGCGATAA ATGTGACAAAAATAATGTGTTATAGAAGTATGACAAAAATAATCACAATCCAAA CAGTCTGATAGCTTGGCGAGTGCAAAATAGATACGAAATCTCTGGTGATATCACA CGGGTCCAAAATAATTGCTTGTTTGAGCATCAGCCTTTCTGCACAAAAAAAGCTA GCCCAAACAAACGAGTGGCGTCCCATCTGAACCACACGCTCACCCGCCGCGTGA CAGCGCCAAAGACAAAACCATCACCCCTCCCCAATTCCAACCCTCTCTCCGCCTC ACAGAAATCTCTCCCCTCGCCCAAACCCTCGCCGCCGCCATGGCCGCCGCCACCT CCCCCGCCGTCGCATTCTCCGGCGCCGCCGCCGCCGCCGCCGCCATGCCCAAGCC CGCCCGCCAGCCTCTCCCGCGCCACCAGCCCGCCTCGCGCCGCGCGCTCCCCGCC CGCGTCGTCAGGTGCTGCGCCGCGCCCCCCGCTGCTGCCACCTCCGCCGCGCCCC CCGCCACCGCGCTCCGGCCCTGGGGCCCGTCCGAGCCCCGCAAGGGCGCCGACA TCCTCGTCGAGGCGCTCGAGCGCTGCGGCATCGTCGACGTATTCGCCTACCCCGG CGGCGCGTCCATGGAGATCCACCAGGCGCTGACGCGCTCGCCCGTCATCACCAAC CACCTCTTCCGCCACGAGCAGGGGGAGGCGTTCGCGGCGTCCGGCTACGCCCGCG CGTCCGGCCGCGTCGGCGTCTGCGTCGCCACCTCCGGCCCGGGGGCCACCAACCT CGTCTCCGCGCTCGCTGACGCCCTCCTCGACTCCATCCCCATGGTCGCCATCACGG GCCAGGTCCCCCGCCGCATGATCGGCACGGACGCGTTCCAGGAGACGCCCATAG TGGAGGTCACGCGCTCCATCACCAAGCACAACTACCTGGTCCTTGACGTGGAGGA TATCCCCCGCGTCATCCAGGAAGCCTTCTTCCTCGCGTCCTCTGGCCGCCCGGGGC CGGTGCTGGTTGATATCCCCAAGGATATCCAGCAGCAGATGGCCGTGCCTATCTG GGACACGCCGATGAGTTTGCCAGGGTACATCGTCCCGCCTGCCCAAGCCACCATC 
TACTGAATCGCTTGAGCAGGTCCTGCGTCTGGTTGGYGAGTCACGGCGCCCAATT CTGTATGTTGGTGGTGGCTGCGCTGCATCCGGCGAGGAGTTGCGCCGCTTTGTTG AGCTCACTGGGATTCCGGTTACAACTACTCTGATGGGCCTTGGCAACTTCCCCAG CGACGACCCACTGTCTCTGCGCATGCTTGGGATGCATGGCACTGTGTATGCAAAT TATGCAGTCGATAAGGCTGACCTGTTGCTTGCATTTGGTGTGCGGTTTGATGATCG CGTGACTGGGAAAATCGAGGCCTTTGCAAGCAGGTCCAAGATTGTGCACATTGAC ATTGACCCAGCTGAGATTGGCAAGAACAAGCAGCCACATGTCTCCATTTGTGCAG ATGTTAAGCTTGCTTTACAGGGGTTGAATGCTCTATTAAATGGGAGCAAAGCACA ACAGGGTCTGGATTTTGGTCCATGGCACAAGGAGTTGGATCAGCAGAAGAGGGA GTTTCCTCTAGGATTCAAGACTTTTGGCGAGGCCATCCCGCCGCAATATGCTATCC AGGTACTGGATGAGCTGACAAAAGGGGAGGCGATCATTGCTACTGGTGTTGGGC AGCACCAGATGTGGGCGGCTCAGTATTACACTTACAAGCGGCCACGGCAGTGGC TGTCTTCGTCTGGTTTGGGGGCAATGGGATTTGGGTTACCAGCTGCAGCTGGCGC TGCTGTGGCCAACCCAGGTGTTACAGTTGTTGACATTGATGGAGATGGTAGTTTC CTCATGAACATTCAGGAGTTGGCATTGATCCGTATTGAGAACCTCCCTGTGAAGG TGATGATATTGAACAACCAGCATCTGGGAATGGTGGTGCAATGGGAGGATAGGT TTTACAAGGCCAATCGGGCGCACACATACCTTGGCAACCCAGAAAATGAGAGTG AGATATATCCAGATTTTGTGACGATTGCTAAAGGATTCAACGTTCCGGCAGTTCG TGTGACGAAGAAGAGCGAAGTCACTGCAGCAATCAAGAAGATGCTTGAGACCCC AGGGCCATACTTGTTGGATATCATCGTCCCGCATCAGGAGCACGTGCTGCCTATG ATCCCAAGCGGTGGTGCTTTCAAGGACATGATCATGGAGGGTGATGGCAGGACC TCGTACTGAAATTTCGACCTACAAGACCTACAAGTGTGACATGCGCAATCAGCAT GGTGCCCGCGTGTTGTATCAACTACTAGGGGTTCAACTGTGAACCATGCGTTTTCT AGTTTGCTTGTTTCATTCATATAAGCTTGTGTTACTTAGTTCCGAACCCTGTAGCT TTGTAGTCTATGCTCTCTTTTGTAGGGATGTGCTGTCATAAGATATCATGCAAGTT TCTTGTCCTACATATCAATAATAAGTACTTCCATGGAATAATTCTCAGTTCTGTTT TGAATTTTGCATCTTCTCACAAACAGTGTGCTGGTTCCTTTCTGTTACTTTACATGT CTGCCGTGTCCGGTTATGACATAATGACCGATGGAGGGTGGTCAGCAGGTTTTAG ACGGGGAGTTGAAACTTTTTTTTGGGGGGAAGAAATCTGAATACAGTTGGGAGG AAAGATAAAAGCATATACCTTGATTAATTTATTGAGCCCAATATCCAGCCTAATT TATCAAGCAATAGGCAGTGTAGGGTGTTGGCATTCTTCTCTTCCTTGAGATCTGGT GTCGGGACCCCGATTCTAAGTCACACCGATCTAGCATGTAACACCTCATATCACT TTGCGGCCTCACGCACGGTATCCTCACGGGTGTCGCCTTACCATGGCCCGGGACC GTTTGCGCCTTTTGGCTCACGTATATGATGGTGTCGCTAGYATCCATATGACAGA GAACCCGGGCCGACATRGCTAGTCGTGAACCCAAAGCGGCACAGACCTATGGAG 
ACAGGCATACATGAATCACATCGAGCATGTCGGTCAACAGCGTATGAATCCGGG CTGTAGCACTGGGCTAACAGGACTCCGGGGAACCCGGGCTGTAGCAGGCTAGGC AGGACTCCGGAAGTCACCGCGTGACATTTCCCCGAAGGGACAGACATAGGAACG AAGTGGAACACATGCCGGCCAGTCAAGTGTTCTGAGCAGTAGTGCTGGGCTAGC AGGACTCCGGTGAACCGGGCTGTAGCGGACTACTATGGCTCGAGGTAGCACTAG ACTACATTTCCCCATAAGAGAGGCTKCCAAGGATAAGCAACTAGATTGTCGGRTC YCRSRYWTTGTCTCCGTGTGTTGTTATTGTTGTCATGCAAGTATGTGTTGTACAAC ATGGCATCACAACATAACGCAAACTCATATAGATATAGGCTCAGAGAGCCACAT AGCATTAATACGAACAGGGTCACATGACCCATCATTCAGAGCATACAGCATGAA GCATCATGTCTGAGTACAGACACTAC

[0455] SEQ ID NO:120

CTGAAAATTCAATATGGCCCTCGGGCACCAATGCTCTTGCTTCCAATTTTCATAAT TCCCATTTGTAAAAAACACACCACAAAAATCACACTGTAGTAATCTACATGTTTG TTGAGCCTATAAATCTTCATAAAATAATTGAGATTAATGCGGTTTGTGCAAAAAT ATGGGGTTGGTCATGTTTCTACATATTTCTATTTGCATTTCGTTAACTGGTGCTTGT TATTTTTGTACATAATGC ATATCTCATTGTTATTATTTTTAACCTTTTGAGATGGTA ACGAAGATCCAAACATGCATAGATGATTCTCCGGATGATTTTTTGTAGCCTGCAC TAGGAACTCCCAAGAGCCAGAAGGTTGGGTTTGTACAAGATAACATTTGTTTGAA CACACTCATAACCTGCATGTGACATACATGACGTAACTTATAGTGATGATTCGAC AAATGTCTCTTTGTCCAATTTTGTTATATATCCCGTGGCAACGCACGGGCATTCGA CTAGTATATGTAAAGATATCAATGTGACGAGTCCCCATGGTCGTTGCGCTTGTCC ACTACCGGCTCGCTAGAGGCGACTCTCACCTAGAAGTCGCTACGAGCAATACATA GTCGTTCTGGGCGCAGCTATGTTCTGCCTTTTGCGACGCTCAGGCACGGCTTGCCT ACAGCCTGAGGGTCGGGCTAGGAACCACTAATTGTGTCATGCTGATGTCACAATG ACATCATGCATATTTTTATTTTCGTTTTTCGCTTTCTCTTTAATTTTATTTGTATTTC AAAATATTTTATATATTTTTTGAATTTTTTCAATGTTGTATTTGAAAAATGTTAAA CCTGTATAGAGAAAAATATTTTTGATATATATAAAAGTATATAACATGAATGAAA AATGTATAAATGTTAATTATGTGTACCAAAAATGTTGATAACAATTAGCAGTCTC ACATATTTCAAAATAAATGTATGTGGAATTAAAAAATATGTGTATTTAAGTTTAA AAAAAATGTTCATGTAATGTTCGTAAAATGTTTGATACATTCAATAAAAATTATG TCACATTTGAATAATTCTTCTCAAGCTTAACAAATGCGCTCATTATATTATCAAAA ATTGTCTGTACAGTGTACACAAATGTTTATGTAGTTCAAAAAAAATGTTTTTTCAG TAAAAATATATTTGATCATGTATTTTATAAAAAACTGTTTAATATATATTTAGAAA ATATATTCAAAACATGTTTCTGTAAAAAGTTAAAACTATACATGTATAATGTAAG TCATTTATAATAAATGTTTTACATGTATAAAAAATGTACAACATATGTGCAAAAG TAGACATGTGTTGAAAAAATAAACAAATAACTAAATAAAAAGAAAATCAAAGAA AAACACCAAAAACCAAAGAAATAAATAAAACCAAAGTATAAAGAAGARRAAAG GAGAAAAAACATTGAAAATCAAAGARAAAAACATAAAGAAGAAAAAAACCGAA GAAAACTAGCAAAAAACACACACACAAAAAAGAAAATGAAAAGAAATAATAAA GAAAGCCGGACTGAACCGATCAAACGCAGCGATCGAACATGGATGAGCTAAGGC ATGCATCGAACAACACGGCTAATTGGCCGGCCCGTAGTCGTTCGCCCGTAGACCA TTCCTACGAATCGGTACCGGAGAGACATAGGGGCTGTATGGTTCCTAACCATACC TTGCCACACTTTGTCACACCTCATCTTAGGCAAATTTAATCAAGTTATGTAGGTGT TTGGTTTTAGCCACATCTAAGGCAAGATTTATTTTCCTGAGCAGTGAACCCCATAT GTTATAGACATAAAAAGTGTGGGAAGATTCCCTTTAGTCAAACTGTGGCTAACAA TTTATTAAGAATTAACTTAAGTAAGATAGGTGCAACAAATGTAGCAAAAATAATG 
TGGTATATATAGCAAAGATAGCCACAACCGCGAGTGGAAATACCAGATACGAGA TCTCTGGTCATATCACACGAGTCCAAATTAATTGCTTTGTTTGAGGTTCAGCCTTT TGCATAAAAAAGCTAGCCCAAACAAACGAGTGGCGTCCCATCTGAACCACACAC TCACCCGCCGCGTGACAGCGCCAAAGACAAAACCATCACCCCTCCCCAATTCCAA CCCTCTCTCTGCCTCACAGAAATCTCTCCCTCGCCCAAACCCTCGCCGCCGCCATG GCCGCAGCCACCTCCCCCGCCGTCGCATTCTCGGGCGCCGCCGCCGCCGCCGCCG CCATACCCAAACCCGCCCGCCAGCCTCTCCCGCGCCACCAGCCCGCCTCGCGCCG CGCGCTCCCCGCCCGCATCGTCAGGTGCTGCGCCGCGTCCCCCGCCGCCACCTCC GTCGCGCCTCCCGCCACCGCGCTCCGGCCGTGGGGCCCCTCCGAGCCCCGCAAGG GCGCCGACATCCTCGTCGAGGCGCTGGAGCGCTGCGGCATCGTCGACGTCTTCGC CTACCCTGGCGGCGCGTCCATGGAGATCCACCAGGCGCTGACGCGCTCGCCAGTC ATCACCAACCACCTCTTCCGCCACGAGCAGGGGGAGGCGTTCGCGGCGTCCGGGT ACGCCCGCGCGTCCGGCCGCGTCGGCGTCTGCGTCGCCACCTCCGGCCCGGGGGC CACCAACCTCGTCTCCGCGCTCGCCGACGCTCTCCTCGACTCCATCCCCATGGTCG CCATCACGGGCCAGGTCCCCCGCCGCATGATCGGCACGGATGCGTTCCAGGAGA CGCCCATCGTGGAGGTCACGCGCTCCATCACCAAGCACAACTACCTGGTCCTTGA CGTGGAGGATATCCCCCGCGTCATCCAGGAAGCCTTCTTCCTCGCATCCTCTGGC CGCCCGGGGCCGGTGCTGGTTGATATCCCCAAGGACATCCAGCAGCAGATGGCT GTGCCTGTCTGGGACACGCCGATGAGTTTGCCAGGGTACATCGCCCGCCTGCCCA AGCCACCATCTACTGAATCGCTTGAGCAGGTCCTGCGTCTGGTTGGCGAGTCACG GCGCCCAATTCTGTATGTTGGTGGTGGCTGCGCTGCATCTGGTGAGGAGTTGCGC CGCTTTGTTGAGCTCACTGGGATTCCAGTTACAACTACTCTTATGGGCCTTGGCAA CTTCCCCAGTGACGACCCACTGTCTCTGCGCATGCTGGGGATGCATGGCACTGTG TATGCAAATTATGCAGTAGATAAGGCTGACCTGTTGCTTGCATTTGGTGTGCGGT TTGATGATCGTGTGACCGGGAAAATCGAGGCTTTTGCAAGCAGGTCCAAGATTGT GCACATTGACATTGACCCAGCTGAGATTGGCAAGAACAAGCAGCCACATGTCTCC ATTTGTGCAGATGTTAAGCTTGCTTTACAGGGGTTGAATGCTCTATTAAATGGGA GCAAAGCACAACAGGGTCTGGATTTTGGTCCATGGCACAAGGAGTTGGATCAGC AGAAGAGGGAGTTTCCTCTAGGATTCAAGACTTTTGGTGAGGCCATCCCGCCGCA ATATGCTATCCAGGTACTGGATGAGCTGACAAAAGGGGAGGCGATCATTGCCAC CGGTGTTGGGCAGCATCAGATGTGGGCGGCTCAGTATTACACTTACAAGCGGCCA CGGCAGTGGCTGTCTTCATCCGGTTTGGGTGCAATGGGATTTGGGTTGCCAGCTG CAGCTGGCGCTGCTGTGGCCAACCCAGGTGTTACAGTTGTTGACATTGATGGGGA TGGTAGTTTCCTCATGAACATTCAGGAGTTGGCGTTGATCCGTATTGAGAACCTC CCAGTGAAGGTGATGATATTGAACAACCAGCATCTGGGAATGGTGGTGCAGTGG 
GAGGATAGGTTTTACAAGGCCAACCGGGCGCACACATACCTTGGCAACCCAGAA AATGAGAGTGAGATATATCCAGATTTTGTGACGATTGCTAAAGGATTCAACGTTC CGGCAGTTCGTGTGACGAAGAAGAGCGAAGTCACTGCAGCAATCAAGAAGATGC TTGAGACCCCAGGGCCATACTTGTTGGATATCATTGTCCCGCATCAGGAGCACGT GCTGCCTATGATCCCAAGCGGTGGTGCTTTTAAGGACATGATCATGGAGGGTGAT GGCAGGACCTCGTACTGAAATTTCGACCTACAAGACCTACAAGTGTGACATGCGC AATCAGCATGATACCTGCGTGTTGTATCAACTACTGGGGGTTCAACTGTGAACCA TGCGTTTTCTAGTTTGCTTGTTTCATTCATATAAGCTTGTGTTACTTAGTTCCGAAC CGTGTAGTTTTGTAGTCTCTGTTCTCTTTTGTAGGGATGTGCTGTCATAAGATATC ATGCAAGTTTCTTGTCCTACATATCAATAATAAGCACTTCCATGGAATAATTCTCA GTTCTGTTTTGAATTTCACATCTTCTCACGAACAGTGTGCTGGTTCCTTTCTGTTAC TTTACATGCCTGCCGTGTCAGGTTATGACATAACGACCGATGGAGGATTGGAGGG TGGTCGGCTGGTTTTAGACGGGGAATTGAAACATTTTTCTGGAAGAAATCTGAAT ACAGTTGGGAGGGGAAATGGAAGCATATATTTATCGAGCCCGCTATCC AGGCTA ATTTATCAAGCACTAGACAGTGTAGGGTGTTGGCATTCTTCTCTTCCTTGATATCC GGCTTGAGAGGAGAGATTGAGGCTTCGGCTGTGTTGGTTGCTGATTTCTACAGCA TTTTGAGAGAGAGAGAGAGATGTTGCAACTGTGTTTTGTCTTGGTTGCTTGTACA GAGAAAGAGATGACATTTAGAGATATGCAGATCGTTTACCAGTTGTGCTGCGTTT ATTCGTACTGATTGTTGTTATTGTTGCTATCATGTGCAAATTGTTGTGATGGAAAA TCAACAAAATTTTGATATTTTGCAAAGCGAGTTGGATTGAATGATTTGAGAAATG GTGACTTGTTGAGTGGCCTTGAGAATTGGTGTTTCATAGGTGTGCAGTTGGTAAT GAAAGGCGGCGGCTTGAAATTTCCGAAAGGCAGGCAATGATACTTTCTGAAAGT CAGCTTATTCTGCTCTTTAGTTTCAGTTGTTTTGTTCACAGATTGCTGGGCAGAGC CCCATGATCGGCTGAGCCTCCAGGAGATCCTTGATTGCTCGACTGCGGATACGTT GAATCCTTTAAAATACTATAAGCTCCCTAGTTTTAGTTTTAGAGAACTGAGAATC AATTGAGGGCAACATTAGTCGATTTTGGCTTCCGATTTTGACTGGGTCGCCTCCCT GGGTCCTCTACAGTTTTGTGGGCCCTATATGTAAGTGCCCCAGTGTTGTGGGCTTT CTGGTCTTTTCTGATGAAAGCGGCGTGGTGGCTGGGGGCTTTAGAATATTTCATT GATTAACTAAAACAAATCAGATCCCTTTTTCCTGCTTCATGTGTGTTTGACCAATC TTTTTTTAAAAATTTCTTTGATTTTATATTTGATGGAGTAAATCTGGCTGTGTCAA CGGTAGTCCATTCGAAACCTGGAAATCGAAATCATTGTACTGCAGGTCTGTTGCC TGTTAGTTTGTTCTTATATAAGATCTTTGACAGTTTATGAATTTGTCTTTGGAATTT GTATAAAGTTTCACAGATAGACAGGCCCTGTTGTTAAATACGTTCGTGCAATTAA GTGTAAACATATCTGCCAGTGATTTTTCTCGGCTCGCATTAGTACGCATAAATTTT TAGCACTTCTCTGAATTTTCTCATATGCAGACCACCTATGAAAAAAACGACATGC 
AAGTAAATAAAACGATTTCAGGTTCATTTAGTAGCAAACCGTTTTTATGTCCTTTA AAAATCAATTAGCAGAGCCACTCCATTCACCGGTCAGCAGAAAAGAAGCATGTG TGTGTTTTTGGGCTATCATAGAGCTAAATAAATTTGATTCCCATCTGTAATGTTCA TCGTTGTTTACATCAGTGTTGGCTGTCGTGTGGTCGTGGAGACTAGCCTGTTCAGA CAATATGTTTGACAAGAGTGTTGTTTTGTGAGATGCGGATGCGGTGCTTGCATCT GTACTTGTTTTTGTGAATACCAGTTAGATGATCAGTTTTTGTGCACTTCTTGCCAT GAATGGCTGTTAAATTGTCACTTTTTAGGAACTTGTTGCCGTAATATCAATTAAAT AATCAATTTTTGTGCATGGTATATCAATTAGATGGTCATTTTTTTCTAGTAGAGAT GTCTATACATGCCAATGCAATGTTCAGAGTTGTTCAAGGTCTCGACGGCGCGGCA AAGCGCGTCCTATGCTTCTAGTTTAAGATGACAACCAAACACGACCCAAGTGTAT GCTATGCTCATCCGGTTGGTCCTTGTTGATGTTCAATGGGCGTGTCTCCATGGGCA TCGACGGCGACAATGTTATCTTCTTCAACTGTCTGCTATATGCTCATTGGCATTTT TGAAACTTTGCAAGCAAGGTCGATAACTTGGTCTGGGGATGTTGACGCCCCTATG TATCTAGATTAGGGTGATGCTCCCGCCAGTATTTTTTGGACGATTATCAACATTTG CGGCTGGTATACTATTGTGGCTAATCAACAAGGTTTTTTTGTGTGTGGCTAATCAA CAAGGTTTGGCGCTCGATGTTTTTTTAATGTATTTCGATGACTCAATTTCTACGTC TGAACATTTCATTGAGCCAAGAGGCAGAACAACAGGTCACATGTAACCGCCAGT GAAAAAGGTTCAAAGAAGAAAAAGATACGAACGACAGCGAGTTTGTAT KCAGT TTTCGAACTAAGAGTAACACGGAGTRCAGTAGTACGATCCTTGTGTMYTTCTGTA TTTGGWTAKTTTTTTTCCGGAGTTGAGTATTWGWAACTTTCTTGTGCTTTTTTTAA CATTAGTACAGATGCAAGTGCTCATACATACGCGCTTTTTGATTTGTAACAATATT ATGAAAGACGTAGTAATTATGTTTGCAGATCAATAAAGCTAGCCATCGTGTGGTG TTCCCAAGAAAAAGATATTCACTATAGATTCACTACATCTTCTAAAAAAACTACA CTGTAGATTCACTACAGACCAACAGAATATTCATGGTCACGTGGATAAAAACTTA CTTTTTGAAAGTCTCAAGCATTTGGTTTGATTTTAAGAAAAAATAACTGACTCTAT TTTTGTGTACTCCTTGCAACGAACCTGGATAAAGATGGAGCCAGTCCGTTCCTGG TTACTAGGAGTATCCATTTCCTGAAGACCATGGAGCAACC ACGGCGGATCGGGCG ATCGGCAGCCTCCCAGCCGGCGACCATGGCGGATGCCACGAGCGCAGGAGCGAC GCCTCTCCTCCCTGGCCTCCTCGACGACATCGTAATCTGTGAGATCCTTGTCCGCC TCGCCCCCCAAAGCCATCCTCCGCTGCCGCGCCGTCACGCCGTGCCTGGCGCCGC ACCACCTCCACCCGCGACTTCCTCCTCGCCCACCACGCCCGCCAGCCCGCCCTCCT CATCACCTCCGGCCACAGTT

[0456] SEQ ID NO: 121

AAATTTTTATAATATTGTTTTTCCAAATTTTATGTTTAAACTCATTTTTGTTCAATT TTTTGTGAATATATTTTAATCCATTGATAGATTTTGAAAATATAATAATTTTTCCA AAACATTCTATAATTTCATAAACCTTTTTAACATTTCAAGAATAAGATTAGGAAA TTTTGATTCTTAAAATATATTTTTAATCTTGCAACTACATTTTTATATACAATTACA TGAGCCAATTTATTTTGGTAGAAATCAACTGAAAAAACAAAAGAAAAAATTGGA ATAGCGGGAGTTCTCTGCGCGAACTTGGGGGGGGGGGGCGACAACCCTCTATCA ATGAGCTAGGGATTCCTATTACATCTCGCCTACAAGCCGCACTAGTTTTTTYCCCA TTTGTTTTATATCGGTTTTTTACTACTTTTGCACCGGTTTTCTTCTGGTATTATTTCA TTTTTCTTCTATACTTTCTGTTGTTTTCTTCGTTTCCCCCTCCTGTTTTTTTGTCTTTT TCTACAGTTTCCTTGTTTCTTTCTTTGGTTTTCACCGATTTACTTTGTTTTTCACGTT TTTAAATTTTAATTTTAATCTTCAGATACATAATTAACATTCATTAAATTATATAC TTTTATGTCAAGTTTTTTCATACACATTGTGCATTTTATACATATTAGGATTCTTAA ATACATGATTAATATTTTATTCAGACATAGAGTACTTGTTTTGAACACTTTTTCAA ATACATGTTGAAATAATTTATTTTATGATATGAAATATGTTTTTTTATTATGCAAA CATTTTTATACACTTTATGTTTTTTTGAAATATTACAAAATTTTTGCTTGAAACGTG TGAACATTTTTTAAAATGTAACATAATTTTTTGAATGGTATGAAACTTTTTTGAAC TGCGCGAACATTATTTTTACATTGTATATTATTTTGATTCATTTTCTGTAAGTTATC GCCTGAATTGCTTGAAAAACGTGATTTTTTTTAAATGCCACATATATTGTTTTTGA ATGGTTCATGCATTTTCTGAAAGTTGATCGAACATGTTTTTATATTGCATTTTTAA AATGTAATAACCACTTTTGAAAATTAACTAATGTATTTTCATAATATATGTATTTA ATATTATTAAAAATAAAAAAAAGGTAAAAGAAAAAACAGATCAACGCGATGAG ACCCCATGGTTGTTGCGCTTGTCCACTACCGGCTCACTGAAGACGTCTCTCACAGT AGGAGTCGCTACGAAGAATACATAGTCGCGCTGGGCGCGGTTATGTTCCGCCTGT TGCGACGCCCAAGCATGGCTTGCCTACAGCTAGAGGGTCGGGCTAGGAACCACT AATTGTGTCATGCTGATGTCACAATGACATCATACATGCTTTTATTTTAATTTTTC GCTTTCTCTTTAAATTTTTTTGTATTTCAAAATATTCTGTTTTTTTAAGAATGCTAG TATTGTATTTGAAAAATGTTAAACCTGTATAGAAAAATATATAACATGAATGAAA AATGTATAGATGTTAATCATGTGTACAAAAAATGATTGTGACAATTAAGAATGTC ACATATTTCAAAATAAATGTATGTGGAATTTTGAAAAAATGTGTATATAATTTTTT AATGGTCATGTAATTTTAAAAAAATGTGTGATACATTCAACAAAAAATATTTCAC ATTTGAATAATTCTTCTTGAGCTTAAGAAATGTGTTCATTATGTTATCAATTTTTTT GTACAGTGTACAAAAATGTTTACATAGTTCAAAAAAATGTTTTTCAGTAAAATTA CATTTCATTGTGTATTTAATATTTTAACACACATTTGGAAAATATATTTGAAACAT GTTTTTGTAAAAAAAAATTTAAAACTATGCTTGTACTCCCTCCGTCCGAAAAAGG TTTACATGTATAAAAGTTTTTTCGGAGGGAGGGATTATAATGTTAGTCATTTATAA 
GAAATGTTTTACATGTATGAAAATGTATAGCATATGTGTAAAAGTAGACATGTGT TGAAAAAAAAAAGTAAAACAACCCAAAAAACCAATGAAAATAAAATAAAACCA AAGTACCAAGAAGAAGAAAAGGAGAATAAACCATTGAAAAACAAAGAAAATAA AAAACATAAAGAAGAAAGAAACCCAAAGAAAACTGGCAAAAATTAGACACAGA AAAGAAAAACGAAAAAATATATAATAAARAAAACCGGACTGAACCGATCGGAC ACGGATGAGCGAAGGCATGCATCGAGCAACACAGCTAATTGGCCGGCCCATAGT CGTTCGCCCGCAGACCATTCATACGAATCGGTACCGGAGAGACATAGGGGCTATT TGGTTTGTAGCCACATTTTGTCATACTTTGTGACACCGCATCTTATGCAAGTTTGA CCAAATTAGGTGGATGTTTAGTTCTAACCACATGTAAGGGAAGATTTTTTTTTATG AGCATTGAACCCGTAGACACAAAAAGTGTAGGAAGATTACTTTAAACAAGCTAA AGTGTGGCTAACAATTTAAGCATCTCAGGTAAGATAAGTGCGACAAATATGGCA AAAATAATGTGGTATATATGACAAAGATAGTCACAATCCAAACAGCCCATAGCC TGGCGAGTGCAAATAGATACGAGATCTCTGGTGATATCACAACCGTCCAAATTAA TTGCTTGTTTCAGCATCAGCCTTTTTGCATAAAGAAGCTAGCCCAATCTGAACCAC ACACTCACCCGCCGCGTGACAGCGCCAAAGACAAAAACATCACCCCTCCCCAATT CCAACCCTCTCTCTGCCTCACAGAAATCTCCCCCCTCGCCCAAACCCTCGCCGCCG CCATGGCCGCCGCCACCTCCCCCGCCGTCGCATTCTCGGGCGCCACCGCCGCCGC CATGCCCAAACCCGCCCGCCATCCTCTCCCGCGCCACCAGCCCGTCTCGCGCCGC GCGCTCCCCGCCCGCGTCGTCAGGTGTTGCGCCGCGTCCCCCGCCGCCACCTCCG CCGCGCCTCCCGCAACCGCGCTCCGGCCCTGGGGCCCGTCCGAGCCCCGCAAGGG CGCCGACATCCTCGTCGAGGCGCTCGAGCGCTGCGGCATCGTCGACGTCTTCGCC TACCCCGGCGGCGCCTCCATGGAGATCCACCAGGCGCTGACGCGCTCGCCCGTCA TCACCAACCACCTCTTCCGCCACGAGCAGGGGGAGGCGTTCGCGGCGTCCGGCTA CGCCCGCGCGTCCGGCCGCGTCGGCGTCTGCGTCGCCACCTCCGGCCCGGGGGCC ACCAACCTCGTCTCCGCGCTCGCCGACGCCCTCCTCGACTCCATCCCCATGGTCGC CATCACGGGCCAGGTCCCCCGCCGCATGATCGGCACGGACGCGTTCCAGGAGAC GCCCATAGTGGAGGTCACGCGCTCCATCACCAAGCACAACTACCTGGTCCTTGAC GTGGAGGATATCCCCCGCGTCATCCAGGAAGCCTTCTTCCTTGCATCCTCTGGCC GCCCGGGGCCGGTGCTAGTTGATATCCCCAAGGACATCCAGCAGCAGATGGCTGT GCCCGTCTGGGACACTCCAATGAGTTTGCCAGGGTACATCGCCCGCCTGCCCAAG CCACCATCTACTGAATCGCTTGAGCAGGTCCTGCGTCTGGTTGGCGAGTCACGGC GCCCAATTCTGTATGTTGGTGGTGGCTGCGCTGCGTCTGGCGAGGAGTTGCGCCG CTTTGTTGAGCTTACTGGGATTCCAGTTACAACTACTCTGATGGGCCTTGGCAACT TCCCCAGCGACGACCCACTGTCTCTGCGCATGCTTGGGATGCATGGCACTGTGTA TGCAAATTATGCAGTAGATAAGGCTGACCTGTTGCTCGCATTTGGTGTGCGGTTT 
GATGATCGTGTGACTGGGAAAATCGAGGCTTTTGCAAGCAGGTCCAAGATTGTGC ACATTGACATTGACCCAGCTGAGATTGGCAAGAACAAGCAGCCACATGTCTCCAT TTGTGCAGATGTTAAGCTTGCTTTACAGGGGTTGAATGATCTATTAAATGGGAGC AAAGCACAACAGGGTCTGGATTTTGGTCCATGGCACAAGGAGTTGGATCAGCAG AAGAGGGAGTTTCCTCTAGGATTCAAGACTTTTGGCGAGGCCATCCCGCCGCAAT ATGCTATCCAGGTACTGGATGAGCTGACAAAAGGGGAGGCGATCATTGCCACTG GTGTTGGGCAGCACCAGATGTGGGCGGCTCAGTATTACACTTACAAGCGGCCACG GCAGTGGCTGTCTTCGTCTGGTTTGGGGGCAATGGGATTTGGGTTACCAGCTGCA GCTGGCGCTGCTGTGGCCAACCCAGGTGTTACAGTTGTTGACATTGATGGTGATG GTAGTTTCCTCATGAACATTCAGGAGTTGGCGTTGATCCGCATTGAGAACCTCCC AGTGAAGGTGATGATATTGAACAACCAGCATCTGGGAATGGTGGTGCAGTGGGA GGATAGGTTTTACAAGGCCAATCGGGCGCACACATACCTTGGCAACCCAGAAAA TGAGAGTGAGATATATCCAGATTTTGTGACGATTGCTAAAGGATTCAACGTTCCA GCAGTTCGAGTGACGAAGAAGAGCGAAGTCACTGCAGCAATCAAGAAGATGCTT GAGACCCCAGGGCCATACTTGTTGGATATCATAGTCCCGCATCAGGAGCACGTGC TGCCTATGATCCCAAGCGGTGGTGCTTTCAAGGACATGATCATGGAGGGTGATGG CAGGACCTCGTACTGAAATTTCGACCTACAAGACCTACAAGTGTGACATGCGCAA TCAGCATGATGCCCGCGTGTTGTATCAACTACTAGGGGTTCAACTGTGAGCCATG CGTTTTCTAGTTTGCTTGTTTCATTCATATAAGCTTGTATTACTTAGTTCCGAACCC TGTAGTTTTGTAGTCTATGTTCTCTTTTGTAGGGATGTGCTGTCATAAGATGTCAT GCAAGTTTCTTGTCCTACATATCAATAATAAGTACTTCCATGGAATAATTCTCAGT TCTGTTTTGAATTTTGCATCTTCTCACAAACAGTGTGCTGGTTCCTTTCTGTTACTT TACATGTCTGCTGTGTCAGGTTCTGACATAACGACCGATGGAGGGTGGTCGGCAG GTTTTAGAAGGGGAATTGAAACTTTTTTTTGGGAAGAAGTCTGAATACAGTTGGG AGGAAAAATAGAAGTATATACTTCGATTAATTTATCAAGCCCGCTATCCAGTCTA ATTTATCAAGCACTAGACAGTGTAGGGTGTTGGCATTCTTCTCTTCCTTGAGATCC GGCTTGAGAGGAGAGACCGAGGCTTCGGCTGTGTTGGTTGCTGATTTCTACAGCT TTTTGAGATAGAGAGAGAGATCCTGCAACTGTGGTTTGTCTTGCTGCTTGTACAG CGAGAGAGACATTGAGAGATATGTAGATCGTTTACCAGTTGTGCTGCTGTTATTC GTACTGGTACTGATTGTTGTTACTGTTGCTATCATGTGCAAATTGTTGTGATGGAA AATCAACAAAATTTTGATATTTTGCAAAGCGAGTTGGATTGAATGATTTGAGAAA TGGTGACTGCTTTCCCTCAGACTTGTTGAGTGGCCTTGAGAATTGGTGTTTCATAG GTGGTGTATGCAGTTGCTAATGAAAGGCGACGGCTTGAAATTTCCGAAAGGCAG CCAATGATACTTTCTGAAAGTGATGTTTTTTTCGTCCAGGTTTCCGGTGGAGCAAG TCTAGACACACGTTGAGCCAATGTTTGTCAGCTTATTCTGCTCTTTAGTTTCAGTT 
TAGGTGCAGTTGTTTTGTTTACAGATTGCTGGGCAGAGCCCCGTGATCGGCTGAG CCTCCAAGAGATCCTTGCTTGCTCGACTGCGGATACGCTGAATCCTTTAAAACGC TCCCTAGTTTTAAGTTTTAGAGAACTGAGAATCAATTGGGGGCAACATTACTGGG TCGCCTCCCTGGGCCTCTACAGTTTTGTGGGCCCTATATGTAAGTGCCCCAGTGTT GTGGGGATTTGCGGCGTGGCGGGCGGCATTTGCGTCCTCTCTTCGGCGGCGCTGT TTCCCCCTCCTTCTTGCTGCTTCTGGAGGAGGTGGTCGGCGGCGGGTGTTGTGGG GGGTCGCATTGGAGCGGCGCGAACGCCGGTCCTGCTGCATCTGCCGCCATTGGTT GTT

[0457] SEQ ID NO:140

CGTTCGCCCGTAGACCATTCATAAGAATCGGTATCGGAGAGACATAGGGGTTCTT TGGTTTCTAACCATATCTTGTCACACTTTACCATACATCACCTTAGTCAAATCTGA TCAAATTAGGTGAGTATTTGGTTCTAGCCACATCTAAGGCAAGATTTGTTTTTCTG AGCAGTGAACCCCATATGTCATAGACAGAAAAATTGTGAAAAGATTCCTTTAGAC GGTCAAAGCGTGGTTAACAATTTAATCAACTCAAGTAAGATAAATGCGATAAAT GTGACAAAAATAATGTGTTATAGAAGTATGACAAAAATAATCACAATCCAAACA GTCTGATAGCTTGGCGAGTGCAAAATAGATACGAAATCTCTGGTGATATCACACG GGTCCAAAATAATTGCTTGTTTGAGCATCAGCCTTTCTGCACAAAAAAAGCTAGC CCAAACAAACGAGTGGCGTCCCATCTGAACCACACGCTCACCCGCCGCGTGACA GCGCCAAAGACAAAACCATCACCCCTCCCCAATTCCAACCCTCTCTCCGCCTCAC AGAAATCTCTCCCCTCGCCCAAACCCTCGCCGCCGCCATGGCCGCCGCCACCTCC CCCGCCGTCGCATTCTCCGGCGCCGCCGCCGCCGCCGCCGCCATGCCCAAGCCCG CCCGCCAGCCTCTCCCGCGCCACCAGCCCGCCTCGCGCCGCGCGCTCCCCGCCCG CGTCGTCAGGTGCTGCGCCGCGCCCCCCGCTGCTGCCACCTCCGCCGCGCCCCCC GCCACCGCGCTCCGGCCCTCGGGGCCCGTCCGAGCCCCGCAAGGGCGCCGACAT CCTCGTCGAGGCGCTCGAGCGCTGCGGCATCGTCGACGTATTCGCCTACCCCGGC GGCGCGTCCATGGAGATCCACCAGGCGCTGACGCGCTCGCCCGTCATCACCAACC ACCTCCTTCCGCCACGAGCGAGGGGGAGGCGTTCGCGGCGTCCGGCTACGCCCGC GCGTCCGGCCGCGTCGGCGTCTGCGTCGCCACCTCCGGCCCGGGGGCCACCAACC TCGTCTCCGCGCTCGCTGACGCCCTCCTCGACTCCATCCCCATGGTCGCCATCACG GGCCAGGTCCCCCGCCGCATGATCGGCACGGACGCGTTCCAGGAGACGCCCATA GTGGAGGTCACGCGCTCCATCACCAAGCACAACTACCTGGTCCTTGACGTGGAGG ATATCCCCCGCGTCATCCAGGAAGCCTTCTTCCTCGCGTCCTCTGGCCGCCCGGG GCCGGTGCTGGTTGATATCCCCAAGGATATCCAGCAGCAGATGGCCGTGCCTATC TGGGACACGCCGATGAGTTTGCCAGGGTACATCGTCCCGCCTGCCCAAGCCACCA TCTACTGAATCGCTTGAGCAGGTCCTGCGTCTGGTTGGCGAGTCACGGCGCCCAA TTCTGTATGTTGGTGGTGGCTGCGCTGCATCCGGCGAGGAGTTGCGCCGCTTTGTT GAGCTCACTGGGATTCCGGTTACAACTACTCTGATGGGCCTTGGCAACTTCCCCA GCGACGACCCACTGTCTCTGCGCATGCTTGGGATGCATGGCACTGTGTATGCAAA TTATGCAGTCGATAAGGCTGACCTGTTGCTTGCATTTGGTGTGCGGTTTGATGATC GCGTGACTGGGAAAATCGAGGCCTTTGCAAGCAGGTCCAAGATTGTGCACATTG ACATTGACCCAGCTGAGATTGGCAAGAACAAGCAGCCACATGTCTCCATTTGTGC AGATGTTAAGCTTGCTTTACAGGGGTTGAATGCTCTATTAAATGGGAGCAAAGCA CAACAGGGTCTGGATTTTGGTCCATGGCACAAGGAGTTGGATCAGCAGAAGAGG GAGTTTCCTCTAGGATTCAAGACTTTTGGCGAGGCCATCCCGCCGCAATATGCTA 
TCCAGGTACTGGATGAGCTGACAAAAGGGGAGGCGATCATTGCTACTGGTGTTG GGCAGCACCAGATGTGGGCGGCTCAGTATTACACTTACAAGCGGCCACGGCAGT GGCTGTCTTCGTCTGGTTGGGGGCAATGGGATTTGGGTTACCAGCTGCAGCTGGC GCTGCTGTGGCCAACCCAGGTGTTACAGTTGTTGACATTGATGGAGATGGTAGTT TCCTCATGAACATTCAGGAGTTGGCATTGATCCGTATTGAGAACCTCCCTGTGAA GGTGATGATATTGAACAACCAGCATCTGGGAATGGTGGTGCAATGGGAGGATAG GTTTTACAAGGCCAATCGGGCGCACACATACCTTGGCAACCCAGAAAATGAGAG TGAGATATATCCAGATTTTGTGACGATTGCTAAAGGATTCAACGTTCCGGCAGTT CGTGTGACGAAGAAGAGCGAAGTCACTGCAGCAATCAAGAAGATGCTTGAGACC CCAGGGCCATACTTGTTGGATATCATCGTCCCGCATCAGGAGCACGTGCTGCCTA TGATCCCAAGCGGTGGTGCTTTCAAGGACATGATCATGGAGGGTGATGGCAGGA CCTCGTACTGAAATTTCGACCTACAAGACCTACAAGTGTGACATGCGCAATCAGC ATGGTGCCCGCGTGTTGTATCAACTACTAGGGGTTCAACTGTGAACCATGCGTTT TCTAGTTTGCTTGTTTCATTCATATAAGCTTGTGTTACTTAGTTCCGAACCCTGTA GCTTTGTAGTCTATGCTCTCTTTTGTAGGGATGTGCTGTCATAAGATATCATGCAA GTTTCTTGTCCTACATATCAATAATAAGTACTTCCATGGAATAATTCTCAGTTCTG TTTTGAATTTTGCATCTTCTCACAAACAGTGTGCTGGTTCCTTTCTGTTACTTTACA TGTCTGCCGTGTCCGGTTATGACATAATGACCGATGGAGGGTGGTCAGCAGGTTT TAGACGGGGAGTTGAAACTTTTTTTTGGGGGGAAGAAATCTGAATACAGTTGGGA GGAAAGATAAAAGCATATACCTTGATTAATTTATTGAGCCCAATATCCAGCCTAA TTTATCAAGCAATAGGCAGTGTAGGGTGTTG

[0458] SEQ ID NO:141

CGTTCGCCCGTAGACCATTCCTACGAATCGGTACCGGAGAGACATAGGGGCTGTA TGGTTCCTAACCATACCTTGCCACACTTTGTCACACCTCATCTTAGGCAAATTTAA TCAAGTTATGTAGGTGTTTGGTTTTAGCCACATCTAAGGCAAGATTTATTTTCCTG AGCAGTGAACCCCATATGTTATAGACATAAAAAGTGTGGGAAGATTCCCTTTAGT CAAACTGTGGCTAACAATTTATTAAGAATTAACTTAAGTAAGATAGGTGCAACAA ATGTAGCAAAAATAATGTGGTATATATAGCAAAGATAGCCACAACCGCGAGTGG AAATACCAGATACGAGATCTCTGGTCATATCACACGAGTCCAAATTAATTGCTTT GTTTGAGGTTCAGCCTTTTTGCATAAAAAAGCTAGCCCAAACAAACGAGTGGCGT CCCATCTGAACCACACACTCACCCGCCGCGTGACAGCGCCAAAGACAAAACCAT CACCCCTCCCCAATTCCAACCCTCTCTCTGCCTCACAGAAATCTCTCCCTCGCCCA AACCCTCGCCGCCGCCATGGCCGCAGCCACCTCCCCCGCCGTCGCATTCTCGGGC GCCGCCGCCGCCGCCGCCGCCATACCCAAACCCGCCCGCCAGCCTCTCCCGCGCC ACCAGCCCGCCTCGCGCCGCGCGCTCCCCGCCCGCATCGTCAGGTGCTGCGCCGC GTCCCCCGCCGCCACCTCCGTCGCGCCTCCCGCCACCGCGCTCCGGCCGTGGGGC CCCTCCGAGCCCCGCAAGGGCGCCGACATCCTCGTCGAGGCGCTGGAGCGCTGC GGCATCGTCGACGTCTTCGCCTACCCTGGCGGCGCGTCCATGGAGATCCACCAGG CGCTGACGCGCTCGCCAGTCATCACCAACCACCTCTTCCGCCACGAGCAGGGGGA GGCGTTCGCGGCGTCCGGGTACGCCCGCGCGTCCGGCCGCGTCGGCGTCTGCGTC GCCACCTCCGGCCCGGGGGCCACCAACCTCGTCTCCGCGCTCGCCGACGCTCTCC TCGACTCCATCCCCATGGTCGCCATCACGGGCCAGGTCCCCCGCCGCATGATCGG CACGGATGCGTTCCAGGAGACGCCCATCGTGGAGGTCACGCGCTCCATCACCAA GCACAACTACCTGGTCCTTGACGTGGAGGATATCCCCCGCGTCATCCAGGAAGCC TTCTTCCTCGCATCCTCTGGCCGCCCGGGGCCGGTGCTGGTTGATATCCCCAAGG ACATCCAGCAGCAGATGGCTGTGCCTGTCTGGGACACGCCGATGAGTTTGCCAGG GTACATCGCCCGCCTGCCCAAGCCACCATCTACTGAATCGCTTGAGCAGGTCCTG CGTCTGGTTGGCGAGTCACGGCGCCCAATTCTGTATGTTGGTGGTGGCTGCGCTG CATCTGGTGAGGAGTTGCGCCGCTTTGTTGAGCTCACTGGGATTCCAGTTACAAC TACTCTTATGGGCCTTGGCAACTTCCCCAGTGACGACCCACTGTCTCTGCGCATGC TGGGGATGCATGGCACTGTGTATGCAAATTATGCAGTAGATAAGGCTGACCTGTT GCTTGCATTTGGTGTGCGGTTTGATGATCGTGTGACCGGGAAAATCGAGGCTTTT GCAAGCAGGTCCAAGATTGTGCACATTGACATTGACCCAGCTGAGATTGGCAAG AACAAGCAGCCACATGTCTCCATTTGTGCAGATGTTAAGCTTGCTTTACAGGGGT TGAATGCTCTATTAAATGGGAGCAAAGCACAACAGGGTCTGGATTTTGGTCCATG GCACAAGGAGTTGGATCAGCAGAAGAGGGAGTTTCCTCTAGGATTCAAGACTTTT GGTGAGGCCATCCCGCCGCAATATGCTATCCAGGTACTGGATGAGCTGACAAAA 
GGGGAGGCGATCATTGCCACCGGTGTTGGGCAGCATCAGATGTGGGCGGCTCAG TATTACACTTACAAGCGGCCACGGCAGTGGCTGTCTTCATCCGGTTTGGGTGCAA TGGGATTTGGGTTGCCAGCTGCAGCTGGCGCTGCTGTGGCCAACCCAGGTGTTAC AGTTGTTGACATTGATGGGGATGGTAGTTTCCTCATGAACATTCAGGAGTTGGCG TTGATCCGTATTGAGAACCTCCCAGTGAAGGTGATGATATTGAACAACCAGCATC TGGGAATGGTGGTGCAGTGGGAGGATAGGTTTTACAAGGCCAACCGGGCGCACA CATACCTTGGCAACCCAGAAAATGAGAGTGAGATATATCCAGATTTTGTGACGAT TGCTAAAGGATTCAACGTTCCGGCAGTTCGTGTGACGAAGAAGAGCGAAGTCAC TGCAGCAATCAAGAAGATGCTTGAGACCCCAGGGCCATACTTGTTGGATATCATT GTCCCGCATCAGGAGCACGTGCTGCCTATGATCCCAAGCGGTGGTGCTTTTAAGG ACATGATCATGGAGGGTGATGGCAGGACCTCGTACTGAAATTTCGACCTACAAG ACCTACAAGTGTGACATGCGCAATCAGCATGATACCTGCGTGTTGTATCAACTAC TGGGGGTTCAACTGTGAACCATGCGTTTTCTAGTTTGCTTGTTTCATTCATATAAG CTTGTGTTACTTAGTTCCGAACCGTGTAGTTTTGTAGTCTCTGTTCTCTTTTGTAGG GATGTGCTGTCATAAGATATCATGCAAGTTTCTTGTCCTACATATCAATAATAAG CACTTCCATGGAATAATTCTCAGTTCTGTTTTGAATTTCACATCTTCTCACGAACA GTGTGCTGGTTCCTTTCTGTTACTTTACATGCCTGCCGTGTCAGGTTATGACATAA CGACCGATGGAGGATTGGAGGGTGGTCGGCTGGTTTTAGACGGGGAATTGAAAC ATTTTTCTGGAAGAAATCTGAATACAGTTGGGAGGGGAAATGGAAGCATATATTT ATCGAGCCCGCTATCCAGGCTAATTTATCAAGCACTAGACAGTGTAGGGTGTTGG CATTCTTCTCTTCCTTGATATCCGGCTTGAGAGGAGAGATTGAGGCTTCGGCTGTG TTGGTTGCTGATTTCTACAGCATTTTGAGAGAGAGAGAGAGATGTTGCAACTGTG TTTTGTCTTGGTTGCTTGTACAGAGAAAGAGATGACATTTAGAGATATGCAGATC GTTTACCAGTTGTGCTGCGTTTATTCGTACTGATTGTTGTTATTGTTGCTATCATGT GCAAATTGTTGTGATGGAAAATCAACAAAATTTTGATATTTTGCAAAGCGAGTTG GATTGAATGATTTGAGAAATGGTGACTTGTTGAGTGGCCTTGAGAATTGGTGTTT CATAGGTGTGCAGTTGGTAATGAAAGGCGGCGGCTTGAAATTTCCGAAAGGCAG GCAATGATACTTTCTGAAAGTGATGTTTTTTCTTCCAGGTTTCCGGTGGAACAAGT CTACGTTGAGCCAATGTTTGTCAGCTTATTCTGCTCTTTAGTTTCAGTTGTTTTGTT CACAGATTGCTGGGCAGAGCCCCATGATCGGCTGAGCCTCCAGGAGATCCTTGAT TGCTCGACTGC

[0459] SEQ ID NO:142

CGTTCGCCCGTAGACCATTCATACGAATCGGTACCGGAGAGACATAGGGGCTATT TGGTTTGTAGCCACATTTTGTCATACTTTGTGACACCGCATCTTATGCAAGTTTGA TCAAATTAGGTGGATGTTTAGTTCTAACCACATGTAAGGGAAGATTTTTTTTTTTA TGAGCATTGAACCCGTAGACACAAAAAGTGTAGGAAGATTACTTTAAACAAGCT AAAGTGTGGCTAACAATTTAAGCATCTCAGGTAAGATAAGTGCGACAAATATGG CAAAAATAATGTGGTATATATGACAAAGATAGTCACAATCCAAACAGCCCATAG CCTGGCGAGTGCAAATAGATACGAGATCTCTGGTGATATCACAACCGTCCAAATT AATTGCTTGTTTCAGCATCAGCCTTTTTGCATAAAGAAGCTAGCCCAATCTGAAC CACACACTCACCCGCCGCGTGACAGCGCCAAAGACAAAACCATCACCCCTCCCC AATTCCAACCCTCTCTCTGCCTCACAGAAATCTCCCCCCTCGCCCAAACCCTCGCC GCCGCCATGGCCGCCGCCACCTCCCCCGCCGTCGCATTCTCGGGCGCCACCGCCG CCGCCATGCCCAAACCCGCCCGCCATCCTCTCCCGCGCCACCAGCCCGTCTCGCG CCGCGCGCTCCCCGCCCGCGTCGTCAGGTGTTGCGCCGCGTCCCCCGCCGCCACC TCCGCCGCGCCTCCCGCAACCGCGCTCCGGCCCTGGGGCCCGTCCGAGCCCCGCA AGGGCGCCGACATCCTCGTCGAGGCGCTCGAGCGCTGCGGCATCGTCGACGTCTT CGCCTACCCCGGCGGCGCCTCCATGGAGATCCACCAGGCGCTGACGCGCTCGCCC GTCATCACCAACCACCTCTTCCGCCACGAGCAGGGGGAGGCGTTCGCGGCGTCCG GCTACGCCCGCGCGTCCGGCCGCGTCGGCGTCTGCGTCGCCACCTCCGGCCCGGG GGCCACCAACCTCGTCTCCGCGCTCGCCGACGCCCTCCTCGACTCCATCCCCATG GTCGCCATCACGGGCCAGGTCCCCCGCCGCATGATCGGCACGGACGCGTTCCAGG AGACGCCCATAGTGGAGGTC ACGCGCTCCATCACCAAGCACAACTACCTGGTCCT TGACGTGGAGGATATCCCCCGCGTCATCCAGGAAGCCTTCTTCCTTGCATCCTCTG GCCGCCCGGGGCCGGTGCTAGTTGATATCCCCAAGGACATCCAGCAGCAGATGG CTGTGCCCGTCTGGGACACTCCAATGAGTTTGCCAGGGTACATCGCCCGCCTGCC CAAGCCACCATCTACTGAATCGCTTGAGCAGGTCCTGCGTCTGGTTGGCGAGTCA CGGCGCCCAATTCTGTATGTTGGTGGTGGCTGCGCTGCGTCTGGCGAGGAGTTGC GCCGCTTTGTTGAGCTTACTGGGATTCCAGTTACAACTACTCTGATGGGCCTTGGC AACTTCCCCAGCGACGACCCACTGTCTCTGCGCATGCTTGGGATGCATGGCACTG TGTATGCAAATTATGCAGTAGATAAGGCTGACCTGTTGCTCGCATTTGGTGTGCG GTTTGATGATCGTGTGACTGGGAAAATCGAGGCTTTTGCAAGCAGGTCCAAGATT GTGCACATTGACATTGACCCAGCTGAGATTGGCAAGAACAAGCAGCCACATGTCT CCATTTGTGCAGATGTTAAGCTTGCTTTACAGGGGTTGAATGATCTATTAAATGG GAGCAAAGCACAACAGGGTCTGGATTTTGGTCCATGGCACAAGGAGTTGGATCA GCAGAAGAGGGAGTTTCCTCTAGGATTCAAGACTTTTGGCGAGGCCATCCCGCCG CAATATGCTATCCAGGTACTGGATGAGCTGACAAAAGGGGAGGCGATCATTGCC 
ACTGGTGTTGGGCAGCACCAGATGTGGGCGGCTCAGTATTACACTTACAAGCGGC CACGGCAGTGGCTGTCTTCGTCTGGTTTGGGGGCAATGGGATTTGGGTTACCAGC TGCAGCTGGCGCTGCTGTGGCCAACCCAGGTGTTACAGTTGTTGACATTGATGGT GATGGTAGTTTCCTCATGAACATTCAGGAGTTGGCGTTGATCCGCATTGAGAACC TCCCAGTGAAGGTGATGATATTGAACAACCAGCATCTGGGAATGGTGGTGCAGT GGGAGGATAGGTTTTACAAGGCCAATCGGGCGCACACATACCTTGGCAACCCAG AAAATGAGAGTGAGATATATCCAGATTTTGTGACGATTGCTAAAGGATTCAACGT TCCAGCAGTTCGAGTGACGAAGAAGAGCGAAGTCACTGCAGCAATCAAGAAGAT GCTTGAGACCCCAGGGCCATACTTGTTGGATATCATAGTCCCGCATCAGGAGCAC GTGCTGCCTATGATCCCAAGCGGTGGTGCTTTCAAGGACATGATCATGGAGGGTG ATGGCAGGACCTCGTACTGAAATTTCGACCTACAAGACCTACAAGTGTGACATGC GCAATCAGCATGATGCCCGCGTGTTGTATCAACTACTAGGGGTTCAACTGTGAGC CATGCGTTTTCTAGTTTGCTTGTTTCATTCATATAAGCTTGTATTACTTAGTTCCGA ACCCTGTAGTTTTGTAGTCTATGTTCTCTTTTGTAGGGATGTGCTGTCATAAGATG TCATGCAAGTTTCTTGTCCTACATATCAATAATAAGTACTTCCATGGAATAATTCT CAGTTCTGTTTTGAATTTTGCATCTTCTCACAAACAGTGTGCTGGTTCCTTTCTGTT ACTTTACATGTCTGCTGTGTCAGGTTCTGACATAACGACCGATGGAGGGTGGTCG GCAGGTTTTAGAAGGGGAATTGAAACTTTTTTTTGGGAAGAAGTCTGAATACAGT TGGGAGGAAAAATAGAAGTATATACTTCGATTAATTTATCAAGCCCGCTATCCAG TCTAATTTATCAAGCACTAGACAGTGTAGGGTGTTGGCATTCTTCTCTTCCTTGAG ATCCGGCTTGAGAGGAGAGACCGAGGCTTCGGCTGTGTTGGTTGCTGATTTCTAC AGCTTTTTGAGATAGAGAGAGAGATCCTGCAACTGTGGTTTGTCTTGCTGCTTGT ACAGCGAGAGAGACATTGAGAGATATGTAGATCGTTTACCAGTTGTGCTGCTGTT ATTCGTACTGGTACTGATTGTTGTTACTGTTGCTATCATGTGCAAATTGTTGTGAT GGAAAATCAACAAAATTTTGATATTTTGCAAAGCGAGTTGGATTGAATGATTTGA GAAATGGTGACTGCTTTCCCTCAGACTTGTTGAGTGGCCTTGAGAATTGGTGTTTC ATAGGTGGTGTATGCAGTTGCTAATGAAAGGCGACGGCTTGAAATTTCCGAAAG GCAGCCAATGATACTTTCTGAAAGTGATGTTTTTTTCGTCCAGGTTTCCGGTGGAG CAAGTCTAGACACACGTTGAGCCAATGTTTGTCAGCTTATTCTGCTCTTTAGTTTC AGTTTAGGTGCAGTTGTTTTGTTTACAGATTGCTGGGCAGAGCCCCGTGATCGGC TGAGCCTCCAAGAGATCCT

[0460] SEQ ID NO:175

tnantggtta ggtgctggtg gtccgaaggt ccacgccgcc aactacg

[0461] SEQ ID NO: 176

CNANTACGTAGTTGGCGGCGTGGACCTTCGGACCACCAGCACCTAAC

[0462] SEQ ID NO: 177

TNANTGGTTAGGTGCTGGTGGTCCGAAGGTCCACGCCGCCAACTACG

[0463] SEQ ID NO: 178

ANGNGTCGTAGTTGGCGGCGTGGACCTTCGGACCACCAGCACCTAAC

[0464] SEQ ID NO: 179

TGGATATCATAGTCCCGCATCAGGAGCACGTGCTGCCTATGATCCCAAGCGGTGG TGCTTTCAAGGACATGATCATGGGTTAGGTGCTGGTGGTCCGAAGGTCCACGCCG CCAACTACGTGGATATCATAGTCCCGCATCAGGAGCACGTGCTGCCTATGATCCC AAGCGGTGGTGCTTTCAAGGACATGATCATGG

[0465] SEQ ID NO: 180

CCATGATCATGTCCTTGAAAGCACCACCGCTTGGGATCATAGGCAGCACGTGCTC CTGATGCGGGACTATGATATCCACGTAGTTGGCGGCGTGGACCTTCGGACCACCA GCACCTAACCCATGATCATGTCCTTGAAAGCACCACCGCTTGGGATCATAGGCAG CACGTGCTCCTGATGCGGGACTATGATATCCA

[0466] SEQ ID NO: 181

TGGATATCATAGTCCCGCATCAGGAGCACGTGCTGCCTATGATCCCAAGCGGTGG TGCTTTCAAGGACATGATCATGGGTTAGGTGCTGGTGGTCCGAAGGTCCACGCCG CCAACTACGGATGGCAGGACCTCGTACTGAAATTTCGACCTACAAGACCTACAAG TGTGACATGCGCAATCAGCATGATGCCCGCGT

[0467] SEQ ID NO: 182

ACGCGGGCATCATGCTGATTGCGCATGTCACACTTGTAGGTCTTGTAGGTCGAAA TTTCAGTACGAGGTCCTGCCATCCGTAGTTGGCGGCGTGGACCTTCGGACCACCA GCACCTAACCCATGATCATGTCCTTGAAAGCACCACCGCTTGGGATCATAGGCAG CACGTGCTCCTGATGCGGGACTATGATATCCA

[0468] SEQ ID NO: 183

GGAGTTGGCGTTGATCCGNC

[0469] SEQ ID NO: 184

AACTACAGGGTTCGGAACTAAGTAANT

[0470] SEQ ID NO: 185

TGGCAGGATATATTGTGGTGTAAACAAATTGACGCTTAGACAACTTAATAACACA TTGCGGACGTTTTTAATGTACTGAATTAACGCCGAATTGAATTCGAGCTCGGTAC CACTGGATTTTGGTTTTAGGAATTAGAAATTTTATTGATAGAAGTATTTTACAAAT ACAAATACATACTAAGGGTTTCTTATATGCTCAACACATGAGCGAAACCCTATAA GAACCCTAATTCCCTTATCTGGGAACTACTCACACATTATTCTGGAGAAAAATAG AGAGAGATAGATTTGTAGAGAGAGACTGGTGATTTTTGCGGACTCTATTAGATCT GGGTAACTGGCCTAACTGGCCTTGGAGGAGCTGGCAACTCAAAATCCCTTTGCCA AAAACCAACATCATGCCATCCACCATGCTTGTATCCAGCTGCGCGCAATGTACCC CGGGCTGTGTATCCCAAAGCCTCATGCAACCTAACAGATGGATCGTTTGGAAGGC CTATAACAGCAACCACAGACTTAAAACCTTGCGCCTCCATAGAC1TAAGCAAATG TGTGTACAATGTGGATCCTAGGCCCAACCTTTGATGCCTATGTGACACGTAAACA GTACTCTCAACTGTCCAATCGTAAGCGTTCCTAGCCTTCCAGGGCCCAGCGTAAG CAATACCAGCCACAACACCCTCAACCTCAGCAACCAACCAAGGGTATCTATCTTG CAACCTCTCGAGATCATCAATCCACTCTTGTGGTGTTTGTGGCTCTGTCCTAAAGT TCACTGTAGACGTCTCAATGTAATGGTTAACGATATCACAAACCGCGGCCATATC AGCTGCTGTAGCTGGCCTAATCTCAACTGGTCTCCTCTCCGGAGACATGGCTTCTA CCTACAAAAAAGCTCCGCACGAGGCTGCATTTGTCACAAATCATGAAAAGAAAA ACTACCGATGAACAATGCTGAGGGATTCAAATTGTACCCACAAAAAGAAGAAAG AAAGATCTAGCACATCTAAGCCTGACGAAGCAGCAGAAATATATAAAAATATAA ACCATAGTGCCCTTTTCCCCTCTTCCTGATCTTGTTTAGCATGGCGGAAATTTTAA ACCCCCCATCATCTCCCCCAACAACGGCGGATCGCAGATCTACATCCGAGAGCCC CATTCCCCGCGAGATCCGGGCCGGATCCACGCCGGCGAGAGCCCCAGCCGCGAG ATCCCGCCCCTCCCGCGCACCGATCTGGGCGCGCACGAAGCCGCCTCTCGCCCAC CCAAACTACCAAGGCCAAAGATCGAGACCGAGACGGAAAAAAAAAACGGAGAA AGAAAGAGGAGAGGGGCGGGGTGGTTACCGGCGCGGCGGCGGCGGAGGGGGAG GGGGGAGGAGCTCGTCGTCCGGCAGCGAGGGGGGAGGAGGTGGAGGTGGTGGT GGTGGTGGTGGTAGGGTTGGGGGGATGGGAGGAGAGGGGGGGGTATGTATATAG TGGCGATGGGGGGCGTTTCTTTGGAAGCGGAGGGAGGGCCGGCCTCGTCGCTGG CTCGCGATCCTCCTCGCGTTTCCGGCCCCCACGACCCGGACCCACCTGCTGTTTTT TCTTTTTCTTTTTTTTCTTTCTTTTTTTTTTTTTGGCTGCGAGACGTGCGGTGCGTGC GGACAACTCACGGTGATAGTGGGGGGGTGTGGAGACTATTGTCCAGTTGGCTGG ACTGGGGTGGGTTGGGTTGGGTTGGGTTGGGCTGGGCTTGCTATGGATCGTGGAT AGCACTTTGGGCTTTAGGAACTTTAGGGGTTGTTTTTGTAAATGTTTTGAGTCTAA GTTTATCTTTTATTTTTACTAGAAAAAATACCCATGCGCTGCAACGGGGGAAAGC TATTTTAATCTTATTATTGTTCATTGTGAGAATTCGCCTGAATATATATTTTTCTCA 
AAAATTATGTCAAATTAGCATATGGGTTTTTTTAAAGATATTTCTTATACAAATCC CTCTGTATTTACAAAAGCAAACGAACTTAAAACCCGACTCAAATACAGATATGCA TTTCCAAAAGCGAATAAACTTAAAAACCAATTCATACAAAAATGACGTATCAAA GTACCGACAAAAACATCCTCAATTTTTATAATAGTAGAAAAGAGTAAATTTCACT TTGGGCCACCTTTTATTACCGATATTTTACTTTATACCACCTTTTAACTGATGTTTT CACTTTTGACCAGGTAATCTTACCTTTGTTTTATTTTGGACTATCCCGACTCTCTTC TCAAGCATATGAATGACCTCGAGTATGCTAGTCTAGAGTCGACCTGCAGGGTGCA GCGTGACCCGGTCGTGCCCCTCTCTAGAGATAATGAGCATTGCATGTCTAAGTTA TAAAAAATTACCACATATTTTTTTTGTCACACTTGTTTGAAGTGCAGTTTATCTAT CTTTATACATATATTTAAACTTTACTCTACGAATAATATAATCTATAGTACTACAA TAATATCAGTGTTTTAGAGAATCATATAAATGAACAGTTAGACATGGTCTAAAGG ACAATTGAGTATTTTGACAACAGGACTCTACAGTTTTATCTTTTTAGTGTGCATGT GTTCTCCTTTTTTTTTGCAAATAGCTTCACCTATATAATACTTCATCCATTTTATTA GTACATCCATTTAGGGTTTAGGGTTAATGGTTTTTATAGACTAATTTTTTTAGTAC ATCTATTTTATTCTATTTTAGCCTCTAAATTAAGAAAACTAAAACTCTATTTTAGT TTTTTTATTTAATAATTTAGATATAAAATAGAATAAAATAAAGTGACTAAAAATT AAACAAATACCCTTTAAGAAATTAAAAAAACTAAGGAAACATTTTTCTTGTTTCG AGTAGATAATGCCAGCCTGTTAAACGCCGTCGACGAGTCTAACGGACACCAACC AGCGAACCAGCAGCGTCGCGTCGGGCCAAGCGAAGCAGACGGCACGGCATCTCT GTCGCTGCCTCTGGACCCCTCTCGAGAGTTCCGCTCCACCGTTGGACTTGCTCCGC TGTCGGCATCCAGAAATTGCGTGGCGGAGCGGCAGACGTGAGCCGGCACGGCAG GCGGCCTCCTCCTCCTCTCACGGCACGGCAGCTACGGGGGATTCCTTTCCCACCG CTCCTTCGCTTTCCCTTCCTCGCCCGCCGTAATAAATAGACACCCCCTCCACACCC TCTTTCCCCAACCTCGTGTTGTTCGGAGCGCACACACACACAACCAGATCTCCCC CAAATCCACCCGTCGGCACCTCCGCTTCAAGGTACGCCGCTCGTCCTCCCCCCCC CCCCCTCTCTACCTTCTCTAGATCGGCGTTCCGGTCCATGGTTAGGGCCCGGTAGT TCTACTTCTGTTCATGTTTGTGTTAGATCCGTGTTTGTGTTAGATCCGTGCTGCTAG CGTTCGTACACGGATGCGACCTGTACGTCAGACACGTTCTGATTGCTAACTTGCC AGTGTTTCTCTTTGGGGAATCCTGGGATGGCTCTAGCCGTTCCGCAGACGGGATC GATTTCATGATTTTTTTTGTTTCGTTGCATAGGGTTTGGTTTGCCCTTTTCCTTTAT TTCAATATATGCCGTGCACTTGTTTGTCGGGTCATCTTTTCATGCTTTTTTTTGTCT TGGTTGTGATGATGTGGTCTGGTTGGGCGGTCGTTCTAGATCGGAGTAGAATTCT GTTTCAAACTACCTGGTGGATTTATTAATTTTGGATCTGTATGTGTGTGCCATACA TATTCATAGTTACGAATTGAAGATGATGGATGGAAATATCGATCTAGGATAGGTA TACATGTTGATGCGGGTTTTACTGATGCATATACAGAGATGCTTTTTGTTCGCTTG 
GTTGTGATGATGTGGTGTGGTTGGGCGGTCGTTCATTCGTTCTAGATCGGAGTAG AATACTGTTTCAAACTACCTGGTGTATTTATTAATTTTGGAACTGTATGTGTGTGT CATACATCTTCATAGTTACGAGTTTAAGATGGATGGAAATATCGATCTAGGATAG GTATACATGTTGATGTGGGTTTTACTGATGCATATACATGATGGCATATGCAGCA TCTATTCATATGCTCTAACCTTGAGTACCTATCTATTATAATAAACAAGTATGTTT TATAATTATTTTGATCTTGATATACTTGGATGATGGCATATGCAGCAGCTATATGT GGATTTTTTTAGCCCTGCCTTCATACGCTATTTATTTGCTTGGTACTGTTTCTTTTG TCGATGCTCACCCTGTTGTTTGGTGTTACTTCTGCAGGAGGATCACAAGTTTGTAC AAAAAAGCAGGCTATGGCCGCCGCCACCTCCCCCGCCGTCGCATTCTCGGGCGCC ACCGCCGCCGCCATGCCCAAACCCGCCCGCCATCCTCTCCCGCGCCACCAGCCCG TCTCGCGCCGCGCGCTCCCCGCCCGCGTCGTCAGGTGTTGCGCCGCGTCCCCCGC CGCCACCTCCGCCGCGCCTCCCGCAACCGCGCTCCGGCCCTGGGGCCCGTCCGAG CCCCGCAAGGGCGCCGACATCCTCGTCGAGGCGCTCGAGCGCTGCGGCATCGTCG ACGTCTTCGCCTACCCCGGCGGCGCCTCCATGGAGATCCACCAGGCGCTGACGCG CTCGCCCGTCATCACCAACCACCTCTTCCGCCACGAGCAGGGGGAGGCGTTCGCG GCGTCCGGCTACGCCCGCGCGTCCGGCCGCGTCGGCGTCTGCGTCGCCACCTCCG GCCCGGGGGCCACCAACCTCGTCTCCGCGCTCGCCGACGCCCTCCTCGACTCCAT CCCCATGGTCGCCATCACGGGCCAGGTCCCCCGCCGCATGATCGGCACGGACGCG TTCCAGGAGACGCCCATAGTGGAGGTCACGCGCTCCATCACCAAGCACAACTACC TGGTCCTTGACGTGGAGGATATCCCCCGCGTCATCCAGGAAGCCTTCTTCCTTGC ATCCTCTGGCCGCCCGGGGCCGGTGCTAGTTGATATCCCCAAGGACATCCAGCAG CAGATGGCTGTGCCCGTCTGGGACACTCCAATGAGTTTGCCAGGGTACATCGCCC GCCTGCCCAAGCCACCATCTACTGAATCGCTTGAGCAGGTCCTGCGTCTGGTTGG CGAGTCACGGCGCCCAATTCTGTATGTTGGTGGTGGCTGCGCTGCGTCTGGCGAG GAGTTGCGCCGCTTTGTTGAGCTTACTGGGATTCCAGTTACAACTACTCTGATGG GCCTTGGCAACTTCCCCAGCGACGACCCACTGTCTCTGCGCATGCTTGGGATGCA TGGCACTGTGTATGCAAATTATGCAGTAGATAAGGCTGACCTGTTGCTCGCATTT GGTGTGCGGTTTGATGATCGTGTGACTGGGAAAATCGAGGCTTTTGCAAGCAGGT CCAAGATTGTGCACATTGACATTGACCCAGCTGAGATTGGCAAGAACAAGCAGC CACATGTCTCCATTTGTGCAGATGTTAAGCTTGCTTTACAGGGGTTGAATGATCTA TTAAATGGGAGCAAAGCACAACAGGGTCTGGATTTTGGTCCATGGCACAAGGAG TTGGATCAGCAGAAGAGGGAGTTTCCTCTAGGATTCAAGACTTTTGGCGAGGCCA TCCCGCCGCAATATGCTATCCAGGTACTGGATGAGCTGACAAAAGGGGAGGCGA TCATTGCCACTGGTGTTGGGCAGCACCAGATGTGGGCGGCTCAGTATTACACTTA CAAGCGGCCACGGCAGTGGCTGTCTTCGTCTGGTTTGGGGGCAATGGGATTTGGG 
TTACCAGCTGCAGCTGGCGCTGCTGTGGCCAACCCAGGTGTTACAGTTGTTGACA TTGATGGTGATGGTAGTTTCCTCATGAACATTCAGGAGTTGGCGTTGATCCGCATT GAGAACCTCCCAGTGAAGGTGATGATATTGAACAACCAGCATCTGGGAATGGTG GTGCAGTGGGAGGATAGGTTTTACAAGGCCAATCGGGCGCACACATACCTTGGC AACCCAGAAAATGAGAGTGAGATATATCCAGATTTTGTGACGATTGCTAAAGGA TTCAACGTTCCAGCAGTTCGAGTGACGAAGAAGAGCGAAGTCACTGCAGCAATC AAGAAGATGCTTGAGACCCCAGGGCCATACTTGTTGGATATCATAGTCCCGCATC AGGAGCACGTGCTGCCTATGATCCCAAATGGTGGTGCTTTCAAGGACATGATCAT GGAGGGTGATGGCAGGACCTCGTACTGATACCCAGCTTTCTTGTACAAAGTGGTG ATCCTACTAGTAGAAGGAGTGCGTCGAAGCAGATCGTTCAAACATTTGGCAATAA AGTTTCTTAAGATTGAATCCTGTTGCCGGTCTTGCGATGATTATCATATAATTTCT GTTGAATTACGTTAAGCATGTAATAATTAACATGTAATGCATGACGTTATTTATG AGATGGGTTTTTATGATTAGAGTCCCGCAATTATACATTTAATACGCGATAGAAA ACAAAATATAGCGCGCAAACTAGGATAAATTATCGCGCGCGGTGTCATCTATGTT ACTAGATCGAAAGCTTAGCTTGAGCTTGGATCAGATTGTCGTTTCCCGCCTTCAGT TTAAACTATCAGTGTTTGACAGGATATATTGGCGGGTAAAC

[0471] SEQ ID NO: 186

ATTTTCCATTCACTTGGCCC

[0472] SEQ ID NO:187

TGCTATCTGGCTCAGCTGC

[0473] SEQ ID NO: 188

ATGGTGGAAGGGCGGTTGTGA

[0474] SEQ ID NO: 189

CTCCCGCGCACCGATCTG

[0475] SEQ ID NO: 190

CCCGCCCCTCTCCTCTTTC

[0476] SEQ ID NO: 191

AAGCCGCCTCTCGCCCACCCA

[0477] SEQ ID NO: 192

AYCAGATGTGGGCGGCTCAGTAT

[0478] SEQ ID NO: 193

GGGATATGTAGGACAAGAAACTTGCATGA

[0479] SEQ ID NO: 194

ACCAGATGTGGGCGGCTCAGTATTACACTTACAAGCGGCCACGGCAGTGGCTGTC TTCGTCTGGTTGGGGGCAATGGGATTTGGGTTACCAGCTGCAGCTGGCGCTGCTG TGGCCAACCCAGGTGTTACAGTTGTTGACATTGATGGAGATGGTAGTTTCCTCAT GAACATTCAGGAGTTGGCATTGATCCGTATTGAGAACCTCCCTGTGAAGGTGATG ATATTGAACAACCAGCATCTGGGAATGGTGGTGCAATGGGAGGATAGGTTTTAC AAGGCCAATCGGGCGCACACATACCTTGGCAACCCAGAAAATGAGAGTGAGATA TATCCAGATTTTGTGACGATTGCTAAAGGATTCAACGTTCCGGCAGTTCGTGTGA CGAAGAAGAGCGAAGTCACTGCAGCAATCAAGAAGATGCTTGAGACCCCAGGGC CATACTTGTTGGATATCATCGTCCCGCATCAGGAGCACGTGCTGCCTATGATtCCc AatGGcGGcGCTTTCAAGGACATGATCATGGAGGGTGATGGCAGGACCTCGTACTG AAATGGTCCGAAGGTCCACGCCGCCAACTACGAGTATGATCCCAAGCGGTGGTG CTTTCAAGGACATGATCATGGAGGGTGATGGCAGGACCTCGTACTGAAATTTCGA CCTACAAGACCTACAAGTGTGACATGCGCAATCAGCATGGTGCCCGCGTGTTGTA TCAACTACTAGGGGTTCAACTGTGAACCATGCGTTTTCTAGTTTGCTTGTTTCATT CATATAAGCTTGTGTTACTTAGTTCCGAACCCTGTAGCTTTGTAGTCTATGCTCTC TTTTGTAGGGATGTGCTGTCATAAGATATCATGCAAGTTTCTTGTCCTACATATC

[0480] SEQ ID NO: 195

ACCAGATGTGGGCGGCTCAGTATTACACTTACAAGCGGCCACGGCAGTGGCTGTC TTCGTCTGGTTGGGGGCAATGGGATTTGGGTTACCAGCTGCAGCTGGCGCTGCTG TGGCCAACCCAGGTGTTACAGTTGTTGACATTGATGGAGATGGTAGTTTCCTCAT GAACATTCAGGAGTTGGCATTGATCCGTATTGAGAACCTCCCTGTGAAGGTGATG ATATTGAACAACCAGCATCTGGGAATGGTGGTGCAATGGGAGGATAGGTTTTAC AAGGCCAATCGGGCGCACACATACCTTGGCAACCCAGAAAATGAGAGTGAGATA TATCCAGATTTTGTGACGATTGCTAAAGGATTCAACGTTCCGGCAGTTCGTGTGA CGAAGAAGAGCGAAGTCACTGCAGCAATCAAGAAGATGCTTGAGACCCCAGGGC CATACTTGTTGGATATCATCGTCCCGCATCAGGAGCACGTGCTGCCTTCCCAAGC GGTGGTGCTTTCAAGGACATGATCATGGAGGGTGATGGCAGGACCTCGTACTGA AATTTCGACCTACAAGACCTACAAGTGTGACATGCGCAATCAGCATGGTGCCCGC GTGTTGTATCAACTACTAGGGGTTCAACTGTGAACCATGCGTTTTCTAGTTTGCTT GTTTCATTCATATAAGCTTGTGTTACTTAGTTCCGAACCCTGTAGCTTTGTAGTCT ATGCTCTCTTTTGTAGGGATGTGCTGTCATAAGATATCATGCAAGTTTCTTGTCCT ACATATC

[0481] SEQ ID NO: 196

ATCAGATGTGGGCGGCTCAGTATTACACTTACAAGCGGCCACGGCAGTGGCTGTC TTCATCCGGTTTGGGTGCAATGGGATTTGGGTTGCCAGCTGCAGCTGGCGCTGCT
GTGGCCAACCCAGGTGTTACAGTTGTTGACATTGATGGGGATGGTAGTTTCCTCA TGAACATTCAGGAGTTGGCGTTGATCCGTATTGAGAACCTCCCAGTGAAGGTGAT GATATTGAACAACCAGCATCTGGGAATGGTGGTGCAGTGGGAGGATAGGTTTTA CAAGGCCAACCGGGCGCACACATACCTTGGCAACCCAGAAAATGAGAGTGAGAT ATATCCAGATTTTGTGACGATTGCTAAAGGATTCAACGTTCCGGCAGTTCGTGTG ACGAAGAAGAGCGAAGTCACTGCAGCAATCAAGAAGATGCTTGAGACCCCAGGG CCATACTTGTTGGATATCATTGTCCCGCATCAGGAGCACGTGAatGGcGGcGCTTTC AAGGACATGATCATGGAGGGTGATGGCAGGACCTCGTACTGAAATGGTCCGAAG GTCCACGCCGCCAACTACGAGTATGATCCCAAGCGGTGGTGCTTTTAAGGACATG ATCATGGAGGGTGATGGCAGGACCTCGTACTGAAATTTCGACCTACAAGACCTAC AAGTGTGACATGCGCAATCAGCATGATACCTGCGTGTTGTATCAACTACTGGGGG TTCAACTGTGAACCATGCGTTTTCTAGTTTGCTTGTTTCATTCATATAAGCTTGTGT TACTTAGTTCCGAACCGTGTAGTTTTGTAGTCTCTGTTCTCTTTTGTAGGGATGTG CTGTCATAAGATATCATGCAAGTTTCTTGTCCTACATATC

[0482] SEQ ID NO: 197

ATCAGATGTGGGCGGCTCAGTATTACACTTACAAGCGGCCACGGCAGTGGCTGTC TTCATCCGGTTTGGGTGCAATGGGATTTGGGTTGCCAGCTGCAGCTGGCGCTGCT GTGGCCAACCCAGGTGTTACAGTTGTTGACATTGATGGGGATGGTAGTTTCCTCA TGAACATTCAGGAGTTGGCGTTGATCCGTATTGAGAACCTCCCAGTGAAGGTGAT GATATTGAACAACCAGCATCTGGGAATGGTGGTGCAGTGGGAGGATAGGTTTTA CAAGGCCAACCGGGCGCACACATACCTTGGCAACCCAGAAAATGAGAGTGAGAT ATATCCAGATTTTGTGACGATTGCTAAAGGATTCAACGTTCCGGCAGTTCGTGTG ACGAAGAAGAGCGAAGTCACTGCAGCAATCAAGAAGATGCTTGAGACCCCAGGG CCATACTTGTTGGATATCATTGTCCCGCATCAGGAGCACGTGCTGCCTATGATCCC AAGCGGTGGTGCTTTTAAGGACATGATCATGGAGGGTGATGGCAGGACCTCGTA CTGAAATTTCGACCTACAAGACCTACAAGTGTGACATGCGCAATCAGCATGATAC CTGCGTGTTGTATCAACTACTGGGGGTTCAACTGTGAACCATGCGTTTTCTAGTTT GCTTGTTTCATTCATATAAGCTTGTGTTACTTAGTTCCGAACCGTGTAGTTTTGTA GTCTCTGTTCTCTTTTGTAGGGATGTGCTGTCATAAGATATCATGCAAGTTTCTTG TCCTACATATC

[0483] SEQ ID NO: 198

ACCAGATGTGGGCGGCTCAGTATTACACTTACAAGCGGCCACGGCAGTGGCTGTC TTCGTCTGGTTTGGGGGCAATGGGATTTGGGTTACCAGCTGCAGCTGGCGCTGCT GTGGCCAACCCAGGTGTTACAGTTGTTGACATTGATGGTGATGGTAGTTTCCTCA TGAACATTCAGGAGTTGGCGTTGATCCGCATTGAGAACCTCCCAGTGAAGGTGAT GATATTGAACAACCAGCATCTGGGAATGGTGGTGCAGTGGGAGGATAGGTTTTA CAAGGCCAATCGGGCGCACACATACCTTGGCAACCCAGAAAATGAGAGTGAGAT ATATCCAGATTTTGTGACGATTGCTAAAGGATTCAACGTTCCAGCAGTTCGAGTG ACGAAGAAGAGCGAAGTCACTGCAGCAATCAAGAAGATGCTTGAGACCCCAGGG CCATACTTGTTGGATATCATAGTCCCGCATCAGGAGCACGTGCTGCCTAatGGcGGc GCTTTCAAGGACATGATCATGGAGGGTGATGGCAGGACCTCGTACTGAAATGGTC CGAAGGTCCACGCCGCCAACTACGAGTCCCAAGCGGTGGTGCTTTCAAGGACAT GATCATGGAGGGTGATGGCAGGACCTCGTACTGAAATTTCGACCTACAAGACCTA CAAGTGTGACATGCGCAATCAGCATGATGCCCGCGTGTTGTATCAACTACTAGGG GTTCAACTGTGAGCCATGCGTTTTCTAGTTTGCTTGTTTCATTCATATAAGCTTGT ATTACTTAGTTCCGAACCCTGTAGTTTTGTAGTCTATGTTCTCTTTTGTAGGGATG TGCTGTCATAAGATGTCATGCAAGTTTCTTGTCCTACATATC

[0484] SEQ ID NO: 199

ACCAGATGTGGGCGGCTCAGTATTACACTTACAAGCGGCCACGGCAGTGGCTGTC TTCGTCTGGTTTGGGGGCAATGGGATTTGGGTTACCAGCTGCAGCTGGCGCTGCT
GTGGCCAACCCAGGTGTTACAGTTGTTGACATTGATGGTGATGGTAGTTTCCTCA TGAACATTCAGGAGTTGGCGTTGATCCGCATTGAGAACCTCCCAGTGAAGGTGAT GATATTGAACAACCAGCATCTGGGAATGGTGGTGCAGTGGGAGGATAGGTTTTA CAAGGCCAATCGGGCGCACACATACCTTGGCAACCCAGAAAATGAGAGTGAGAT ATATCCAGATTTTGTGACGATTGCTAAAGGATTCAACGTTCCAGCAGTTCGAGTG ACGAAGAAGAGCGAAGTCACTGCAGCAATCAAGAAGATGCTTGAGACCCCAGGG CCATACTTGTTGGATATCATAGTCCCGCATCAGGAGCACGTGCTGCCTATGATCC CAAGCGGTGGTGCTTTCAAGGACATGATCATGGAGGGTGATGGCAGGACCTCGT ACTGAAATTTCGACCTACAAGACCTACAAGTGTGACATGCGCAATCAGCATGATG CCCGCGTGTTGTATCAACTACTAGGGGTTCAACTGTGAGCCATGCGTTTTCTAGTT TGCTTGTTTCATTCATATAAGCTTGTATTACTTAGTTCCGAACCCTGTAGTTTTGTA GTCTATGTTCTCTTTTGTAGGGATGTGCTGTCATAAGATGTCATGCAAGTTTCTTG TCCTACATATC

[0485] SEQ ID NO:200

ACCAGATGTGGGCGGCTCAGTATTACACTTACAAGCGGCCACGGCAGTGGCTGTC TTCGTCTGGTTGGGGGCAATGGGATTTGGGTTACCAGCTGCAGCTGGCGCTGCTG TGGCCAACCCAGGTGTTACAGTTGTTGACATTGATGGAGATGGTAGTTTCCTCAT GAACATTCAGGAGTTGGCATTGATCCGTATTGAGAACCTCCCTGTGAAGGTGATG ATATTGAACAACCAGCATCTGGGAATGGTGGTGCAATGGGAGGATAGGTTTTAC AAGGCCAATCGGGCGCACACATACCTTGGCAACCCAGAAAATGAGAGTGAGATA TATCCAGATTTTGTGACGATTGCTAAAGGATTCAACGTTCCGGCAGTTCGTGTGA CGAAGAAGAGCGAAGTCACTGCAGCAATCAAGAAGATGCTTGAGACCCCAGGGC CATACTTGTTGGATATCATCGTCCCGCATCAGGAGCACAAGCGGTGGTGCTTTCA AGGACATGATCATGGAGGGTGATGGCAGGACCTCGTACTGAAATTTCGACCTAC AAGACCTACAAGTGTGACATGCGCAATCAGCATGGTGCCCGCGTGTTGTATCAAC TACTAGGGGTTCAACTGTGAACCATGCGTTTTCTAGTTTGCTTGTTTCATTCATAT AAGCTTGTGTTACTTAGTTCCGAACCCTGTAGCTTTGTAGTCTATGCTCTCTTTTGT AGGGATGTGCTGTCATAAGATATCATGCAAGTTTCTTGTCCTACATATC

[0486] SEQ ID NO:201

ACCAGATGTGGGCGGCTCAGTATTACACTTACAAGCGGCCACGGCAGTGGCTGTC TTCGTCTGGTTGGGGGCAATGGGATTTGGGTTACCAGCTGCAGCTGGCGCTGCTG TGGCCAACCCAGGTGTTACAGTTGTTGACATTGATGGAGATGGTAGTTTCCTCAT GAACATTCAGGAGTTGGCATTGATCCGTATTGAGAACCTCCCTGTGAAGGTGATG ATATTGAACAACCAGCATCTGGGAATGGTGGTGCAATGGGAGGATAGGTTTTAC AAGGCCAATCGGGCGCACACATACCTTGGCAACCCAGAAAATGAGAGTGAGATA TATCCAGATTTTGTGACGATTGCTAAAGGATTCAACGTTCCGGCAGTTCGTGTGA CGAAGAAGAGCGAAGTCACTGCAGCAATCAAGAAGATGCTTGAGACCCCAGGGC CATACTTGTTGGATATCATCGTCCCGCATCAGGAGCACGTGCTGCCTATGATCCC AAGCGGTGGTGCTTTCAAGGACATGATCATGGAGGGTGATGGCAGGACCTCGTA CTGAAATTTCGACCTACAAGACCTACAAGTGTGACATGCGCAATCAGCATGGTGC CCGCGTGTTGTATCAACTACTAGGGGTTCAACTGTGAACCATGCGTTTTCTAGTTT GCTTGTTTCATTCATATAAGCTTGTGTTACTTAGTTCCGAACCCTGTAGCTTTGTA GTCTATGCTCTCTTTTGTAGGGATGTGCTGTCATAAGATATCATGCAAGTTTCTTG TCCTACATATC

[0487] SEQ ID NO:202

[0488] SEQ ID NO:203

ATCAGATGTGGGCGGCTCAGTATTACACTTACAAGCGGCCACGGCAGTGGCTGTC

TTCATCCGGTTTGGGTGCAATGGGATTTGGGTTGCCAGCTGCAGCTGGCGCTGCT

GTGGCCAACCCAGGTGTTACAGTTGTTGACATTGATGGGGATGGTAGTTTCCTCA TGAACATTCAGGAGTTGGCGTTGATCCGTATTGAGAACCTCCCAGTGAAGGTGAT GATATTGAACAACCAGCATCTGGGAATGGTGGTGCAGTGGGAGGATAGGTTTTA CAAGGCCAACCGGGCGCACACATACCTTGGCAACCCAGAAAATGAGAGTGAGAT ATATCCAGATTTTGTGACGATTGCTAAAGGATTCAACGTTCCGGCAGTTCGTGTG ACGAAGAAGAGCGAAGTCACTGCAGCAATCAAGAAGATGCTTGAGACCCCAGGG CCATACTTGTTGGATATCATTGTCCCGCATCAGGAGCACGTGCTGCCTATGATCCC AAGCGGTGGTGCTTTTAAGGACATGATCATGGAGGGTGATGGCAGGACCTCGTA CTGAAATTTCGACCTACAAGACCTACAAGTGTGACATGCGCAATCAGCATGATAC CTGCGTGTTGTATCAACTACTGGGGGTTCAACTGTGAACCATGCGTTTTCTAGTTT GCTTGTTTCATTCATATAAGCTTGTGTTACTTAGTTCCGAACCGTGTAGTTTTGTA GTCTCTGTTCTCTTTTGTAGGGATGTGCTGTCATAAGATATCATGCAAGTTTCTTG TCCTACATATC

[0489] SEQ ID NO:204

ACCAGATGTGGGCGGCTCAGTATTACACTTACAAGCGGCCACGGCAGTGGCTGTC TTCGTCTGGTTTGGGGGCAATGGGATTTGGGTTACCAGCTGCAGCTGGCGCTGCT GTGGCCAACCCAGGTGTTACAGTTGTTGACATTGATGGTGATGGTAGTTTCCTCA TGAACATTCAGGAGTTGGCGTTGATCCGCATTGAGAACCTCCCAGTGAAGGTGAT GATATTGAACAACCAGCATCTGGGAATGGTGGTGCAGTGGGAGGATAGGTTTTA CAAGGCCAATCGGGCGCACACATACCTTGGCAACCCAGAAAATGAGAGTGAGAT ATATCCAGATTTTGTGACGATTGCTAAAGGATTCAACGTTCCAGCAGTTCGAGTG ACGAAGAAGAGCGAAGTCACTGCAGCAATCAAGAAGATGCTTGAGACCCCAGGG CCATACTTGTTGGATATCATAGTCCCGCATCAGGAGCACGTGCTGCCTATGATtCC cAatGGcGGcGCTTTCAAGGACATGATCATGGAGGGTGATGGCAGGACCTCGTACT GAAATGGTCCGAAGGTCCACGCCGCCAACTACGAGTCCCAAGCGGTGGTGCTTTC AAGGACATGATCATGGAGGGTGATGGCAGGACCTCGTACTGAAATTTCGACCTA CAAGACCTACAAGTGTGACATGCGCAATCAGCATGATGCCCGCGTGTTGTATCAA CTACTAGGGGTTCAACTGTGAGCCATGCGTTTTCTAGTTTGCTTGTTTCATTCATA TAAGCTTGTATTACTTAGTTCCGAACCCTGTAGTTTTGTAGTCTATGTTCTCTTTTG TAGGGATGTGCTGTCATAAGATGTCATGCAAGTTTCTTGTCCTAC ATATC

[0490] SEQ ID NO:205

ACCAGATGTGGGCGGCTCAGTATTACACTTACAAGCGGCCACGGCAGTGGCTGTC TTCGTCTGGTTTGGGGGCAATGGGATTTGGGTTACCAGCTGCAGCTGGCGCTGCT GTGGCCAACCCAGGTGTTACAGTTGTTGACATTGATGGTGATGGTAGTTTCCTCA TGAACATTCAGGAGTTGGCGTTGATCCGCATTGAGAACCTCCCAGTGAAGGTGAT GATATTGAACAACCAGCATCTGGGAATGGTGGTGCAGTGGGAGGATAGGTTTTA CAAGGCCAATCGGGCGCACACATACCTTGGCAACCCAGAAAATGAGAGTGAGAT ATATCCAGATTTTGTGACGATTGCTAAAGGATTCAACGTTCCAGCAGTTCGAGTG ACGAAGAAGAGCGAAGTCACTGCAGCAATCAAGAAGATGCTTGAGACCCCAGGG CCATACTTGTTGGATATCATAGTCCCGCATCAGGAGCACGTGCTGCCTATGATCC CAAGCGGTGGTGCTTTCAAGGACATGATCATGGAGGGTGATGGCAGGACCTCGT ACTGAAATTTCGACCTACAAGACCTACAAGTGTGACATGCGCAATCAGCATGATG CCCGCGTGTTGTATCAACTACTAGGGGTTCAACTGTGAGCCATGCGTTTTCTAGTT TGCTTGTTTCATTCATATAAGCTTGTATTACTTAGTTCCGAACCCTGTAGTTTTGTA GTCTATGTTCTCTTTTGTAGGGATGTGCTGTCATAAGATGTCATGCAAGTTTCTTG TCCTACATATC

[0491] SEQ ID NO:206

[0492] SEQ ID NO:207

[0493] SEQ ID NO:208

[0494] SEQ ID NO:209

[0495] SEQ ID NO:210

ACCAGATGTGGGGGGCTCAGTATTACACTTACAAGCGGCCACGGCAGTGGCTGTC TTCGTCTGGTTTGGGGGCAATGGGATTTGGGTTACCAGCTGCAGCTGGCGCTGCT GTGGCCAACCCAGGTGTTACAGTTGTTGACATTGATGGTGATGGTAGTTTCCTCA TGAACATTCAGGAGTTGGCGTTGATCCGCATTGAGAACCTCCCAGTGAAGGTGAT GATATTGAACAACCAGCATCTGGGAATGGTGGTGCAGTGGGAGGATAGGTTTTA CAAGGCCAATCGGGCGCACACATACCTTGGCAACCCAGAAAATGAGAGTGAGAT ATATCCAGATTTTGTGACGATTGCTAAAGGATTCAACGTTCCAGCAGTTCGAGTG ACGAAGAAGAGCGAAGTCACTGCAGCAATCAAGAAGATGCTTGAGACCCCAGGG CCATACTTGTTGGATATCATAGTCCCGCATCAGGAGCACGTGCTGCCTATGATCC CAAGCGGTGGTGCTTTCAAGGACATGATCATGGAGGGTGATGGCAGGACCTCGT ACTGAAATTTCGACCTACAAGACCTACAAGTGTGACATGCGCAATCAGCATGATG CCCGCGTGTTGTATCAACTACTAGGGGTTCAACTGTGAGCCATGCGTTTTCTAGTT TGCTTGTTTCATTCATATAAGCTTGTATTACTTAGTTCCGAACCCTGTAGTTTTGTA GTCTATGTTCTCTTTTGTAGGGATGTGCTGTCATAAGATGTCATGCAAGTTTCTTG TCCTACATATC

[0496] SEQ ID NO:211

[0497] SEQ ID NO:212

[0498] SEQ ID NO:213

[0499] SEQ ID NO:214

ATCAGATGTGGGCGGCTCAGTATTACACTTACAAGCGGCCACGGCAGTGGCTGTC TTCATCCGGTTTGGGTGCAATGGGATTTGGGTTGCCAGCTGCAGCTGGCGCTGCT GTGGCCAACCCAGGTGTTACAGTTGTTGACATTGATGGGGATGGTAGTTTCCTCA TGAACATTCAGGAGTTGGCGTTGATCCGTATTGAGAACCTCCCAGTGAAGGTGAT GATATTGAACAACCAGCATCTGGGAATGGTGGTGCAGTGGGAGGATAGGTTTTA CAAGGCCAACCGGGCGCACACATACCTTGGCAACCCAGAAAATGAGAGTGAGAT ATATCCAGATTTTGTGACGATTGCTAAAGGATTCAACGTTCCGGCAGTTCGTGTG ACGAAGAAGAGCGAAGTCACTGCAGCAATCAAGAAGATGCTTGAGACCCCAGGG CCATACTTGTTGGATATCATTGTCCCGCATCAGGAGCACGTGCTGTAGTTGGCGG CGCTTTCAAGGACATGATCATGGAGGGTGATGKCAGGACCTCGTACTGAAATGGT CCGAAGGTCCACGCCTCGTATGAAATGGTCCGAAGGTCCACGCCGCCAACTACG AGNNl^MNNNTN^

CCCAATGGCGGCGCTTTCAAGGACATGATCATGGAGGGTGATGGCAGGACCTCG TACTGAAATGGTCCGAAGGTCCACGCCGCCAACTACGATGATCCCAAGCGGTGGT GCTTTTAAGGACATGATCATGGAGGGTGATGGCAGGACCTCGTACTGAAATTTCG ACCTACAAGACCTACAAGTGTGACATGCGCAATCAGCATGATACCTGCGTGTTGT ATCAACTACTGGGGGTTCAACTGTGAACCATGCGTTTTCTAGTTTGCTTGTTTCAT TCATATAAGCTTGTGTTACTTAGTTCCGAACCGTGTAGTTTTGTAGTCTCTGTTCT CTTTTGTAGGGATGTGCTGTCATAAGATATCATGCAAGTTTCTTGTCCTACATATC

[0500] SEQ ID NO:215

[0501] SEQ ID NO:216

ACCAGATGTGGGCGGCTCAGTATTACACTTACAAGCGGCCACGGCAGTGGCTGTC TTCGTCTGGTTTGGGGGCAATGGGATTTGGGTTACCAGCTGCAGCTGGCGCTGCT GTGGCCAACCCAGGTGTTACAGTTGTTGACATTGATGGTGATGGTAGTTTCCTCA TGAACATTCAGGAGTTGGCGTTGATCCGCATTGAGAACCTCCCAGTGAAGGTGAT GATATTGAACAACCAGCATCTGGGAATGGTGGTGCAGTGGGAGGATAGGTTTTA CAAGGCCAATCGGGCGCACACATACCTTGGCAACCCAGAAAATGAGAGTGAGAT ATATCCAGATTTTGTGACGATTGCTAAAGGATTCAACGTTCCAGCAGTTCGAGTG ACGAAGAAGAGCGAAGTCACTGCAGCAATCAAGAAGATGCTTGAGACCCCAGGG CCATACTTGTTGGATATCATAGTCCCGCATCAGGAGCACGTGCTGCCTATGATtCC cAatGGcGGcGCTTTCAAGGACATGATCATGGAGGGTGATGGCAGGACCTCGTACT GAAATGGTCCGAAGGTCCACGCCGCCAACTACGAGTATGATCCCAAGCGGTGGT GCTTTCAAGGACATGATCATGGAGGGTGATGGCAGGACCTCGTACTGAAATTTCG ACCTACAAGACCTACAAGTGTGACATGCGCAATCAGCATGATGCCCGCGTGTTGT ATCAACTACTAGGGGTTCAACTGTGAGCCATGCGTTTTCTAGTTTGCTTGTTTCAT TCATATAAGCTTGTATTACTTAGTTCCGAACCCTGTAGTTTTGTAGTCTATGTTCT CTTTTGTAGGGATGTGCTGTCATAAGATGTCATGCAAGTTTCTTGTCCTACATATC

[0502] SEQ ID NO:217

ACCAGATGTGGGCGGCTCAGTATTACACTTACAAGCGGCCACGGCAGTGGCTGTC TTCGTCTGGTTTGGGGGCAATGGGATTTGGGTTACCAGCTGCAGCTGGCGCTGCT GTGGCCAACCCAGGTGTTACAGTTGTTGACATTGATGGTGATGGTAGTTTCCTCA TGAACATTCAGGAGTTGGCGTTGATCCGCATTGAGAACCTCCCAGTGAAGGTGAT GATATTGAACAACCAGCATCTGGGAATGGTGGTGCAGTGGGAGGATAGGTTTTA CAAGGCCAATCGGGCGCACACATACCTTGGCAACCCAGAAAATGAGAGTGAGAT ATATCCAGATTTTGTGACGATTGCTAAAGGATTCAACGTTCCAGCAGTTCGAGTG ACGAAGAAGAGCGAAGTCACTGCAGCAATCAAGAAGATGCTTGAGACCCCAGGG CCATACTTGTTGGATATCATAGTCCCGCATCAGGAGCACGTGCTGCCTATGATTC CCAATGGCGGCGCTTTCAAGGACATGATCATGGAGGGTGATGGCAGGACCTCGT ACTGAAAATGGTCCGAAGGTCCACGCCGCCACCTCGTACTGAAATGGTCCRAAG GTCCACGCCGCCAACTACGAGTATGATCCCAAGCGGTGGTGCTTTCAAGGACATG ATCATGGAGGGTGATGGCAGGACCTCGTACTGARATTTCGACCTACAAGACCTAC AAGTGTGACATGCGCAATGAGCATGATGCCCGCGTGTTGTATCAACTACTAGGGG TTCAACTGTGAGCCATGCGTTTTCTAGTTTGCTTGTTTCATTCATATAAGCTTGTAT TACTTAGTTCCGAACCCTGTAGTTTTGTAGTCTATGTTCTCTTTTGTAGGGATGTG CTGTCATAAGATGTCATGCAAGTTTCTTGTCCTACATATC

[0503] SEQ ID NO:218

[0504] SEQ ID NO:219

ACCAGATGTGGGCGGCTCAGTATTACACTTACAAGCGGCCACGGCAGTGGCTGTC

TTCGTCTGGTTGGGGGCAATGGGATTTGGGTTACCAGCTGCAGCTGGCGCTGCTG

TGGCCAACCCAGGTGTTACAGTTGTTGACATTGATGGAGATGGTAGTTTCCTCAT GAACATTCAGGAGTTGGCATTGATCCGTATTGAGAACCTCCCTGTGAAGGTGATG ATATTGAACAACCAGCATCTGGGAATGGTGGTGCAATGGGAGGATAGGTTTTAC AAGGCCAATCGGGCGCACACATACCTTGGCAACCCAGAAAATGAGAGTGAGATA TATCCAGATTTTGTGACGATTGCTAAAGGATTCAACGTTCCGGCAGTTCGTGTGA CGAAGAAGAGCGAAGTCACTGCAGCAATCAAGAAGATGCTTGAGACCCCAGGGC CATACTTGTTGGATATCATCGTCCCGCATCAGGAGCACGTGCTGCCTATGATCCC AAGCGGTGGTGCTTTCAAGGACATGATCATGGAGGGTGATGGCAGGACCTCGTA CTGAAATTTCGACCTACAAGACCTACAAGTGTGACATGCGCAATCAGCATGGTGC CCGCGTGTTGTATCAACTACTAGGGGTTCAACTGTGAACCATGCGTTTTCTAGTTT GCTTGTTTCATTCATATAAGCTTGTGTTACTTAGTTCCGAACCCTGTAGCTTTGTA GTCTATGCTCTCTTTTGTAGGGATGTGCTGTCATAAGATATCATGCAAGTTTCTTG TCCTACATATC

[0505] SEQ ID NO:220

ATCAGATGTGGGCGGCTCAGTATTACACTTACAAGCGGCCACGGCAGTGGCTGTC TTCATCCGGTTTGGGTGCAATGGGATTTGGGTTGCCAGCTGCAGCTGGCGCTGCT GTGGCCAACCCAGGTGTTACAGTTGTTGACATTGATGGGGATGGTAGTTTCCTCA TGAACATTCAGGAGTTGGCGTTGATCCGTATTGAGAACCTCCCAGTGAAGGTGAT GATATTGAACAACCAGCATCTGGGAATGGTGGTGCAGTGGGAGGATAGGTTTTA CAAGGCCAACCGGGCGCACACATACCTTGGCAACCCAGAAAATGAGAGTGAGAT ATATCCAGATTTTGTGACGATTGCTAAAGGATTCAACGTTCCGGCAGTTCGTGTG ACGAAGAAGAGCGAAGTCACTGCAGCAATCAAGAAGATGCTTGAGACCCCAGGG CCATACTTGTTGGATATCATTGTCCCGCATGCTTTTAAGGACATGATCATGGAGG GTGATGGCAGGACCTCGTACTGAAATTTCGACCTACAAGACCTACAAGTGTGACA TGCGCAATCAGCATGATACCTGCGTGTTGTATCAACTACTGGGGGTTCAACTGTG AACCATGCGTTTTCTAGTTTGCTTGTTTCATTCATATAAGCTTGTGTTACTTAGTTC CGAACCGTGTAGTTTTGTAGTCTCTGTTCTCTTTTGTAGGGATGTGCTGTCATAAG ATATCATGCAAGTTTCTTGTCCTACATATC

[0506] SEQ ID NO:221

[0507] SEQ ID NO:222

[0508] SEQ ID NO:223

[0509] SEQ ID NO:224

[0510] SEQ ID NO:225

ACCAGATGTGGGCGGCTCAGTATTACACTTACAAGCGGCCACGGCAGTGGCTGTC

TTCGTCTGGTTGGGGGCAATGGGATTTGGGTTACCAGCTGCAGCTGGCGCTGCTG

[0511] SEQ ID NO:226

ATCAGATGTGGGCGGCTCAGTATTACACTTACAAGCGGCCACGGCAGTGGCTGTC TTCATCCGGTTTGGGTGCAATGGGATTTGGGTTGCCAGCTGCAGCTGGCGCTGCT GTGGCCAACCCAGGTGTTACAGTTGTTGACATTGATGGGGATGGTAGTTTCCTCA TGAACATTCAGGAGTTGGCGTTGATCCGTATTGAGAACCTCCCAGTGAAGGTGAT GATATTGAACAACCAGCATCTGGGAATGGTGGTGCAGTGGGAGGATAGGTTTTA CAAGGCCAACCGGGCGCACACATACCTTGGCAACCCAGAAAATGAGAGTGAGAT ATATCCAGATTTTGTGACGATTGCTAAAGGATTCAACGTTCCGGCAGTTCGTGTG ACGAAGAAGAGCGAAGTCACTGCAGCAATCAAGAAGATGCTTGAGACCCCAGGG CCATACTTGTTGGATATCATTGTCCCGCATCAGGAGCACGTGCTGCCTATGATtCC cAatGGcGGcGCTTTCAAGGACATGATCATGGAGGGTGATGGCAGGACCTCGTACT GAAATGGTCCGAAGGTCCACGCCGCCAACTACGAGTATGATCCCAAGCGGTGGT GCTTTTAAGGACATGATCATGGAGGGTGATGGCAGGACCTCGTACTGAAATTTCG ACCTACAAGACCTACAAGTGTGACATGCGCAATCAGCATGATACCTGCGTGTTGT ATCAACTACTGGGGGTTCAACTGTGAACCATGCGTTTTCTAGTTTGCTTGTTTCAT TCATATAAGCTTGTGTTACTTAGTTCCGAACCGTGTAGTTTTGTAGTCTCTGTTCT CTTTTGTAGGGATGTGCTGTCATAAGATATCATGCAAGTTTCTTGTCCTACATATC

[0512] SEQ ID NO:227

ATCAGATGTGGGCGGCTCAGTATTACACTTACAAGCGGCCACGGCAGTGGCTGTC

TTCATCCGGTTTGGGTGCAATGGGATTTGGGTTGCCAGCTGCAGCTGGCGCTGCT

[0513] SEQ ID NO:228

ACCAGATGTGGGCGGCTCAGTATTACACTTACAAGCGGCCACGGCAGTGGCTGTC TTCGTCTGGTTTGGGGGCAATGGGATTTGGGTTACCAGCTGCAGCTGGCGCTGCT GTGGCCAACCCAGGTGTTACAGTTGTTGACATTGATGGTGATGGTAGTTTCCTCA TGAACATTCAGGAGTTGGCGTTGATCCGCATTGAGAACCTCCCAGTGAAGGTGAT GATATTGAACAACCAGCATCTGGGAATGGTGGTGCAGTGGGAGGATAGGTTTTA CAAGGCCAATCGGGCGCACACATACCTTGGCAACCCAGAAAATGAGAGTGAGAT ATATCCAGATTTTGTGACGATTGCTAAAGGATTCAACGTTCCAGCAGTTCGAGTG ACGAAGAAGAGCGAAGTCACTGCAGCAATCAAGAAGATGCTTGAGACCCCAGGG CCATACTTGTTGGATATCATCGTCCCGCATCAGGAGCACGTGCTGCCTATGATCC CAAGCGGTGGTGCTTTCAAGGACATGATCATGGAGGGTGATGGCAGGACCTCGT ACTGAAATTTCGACCTACAAGACCTACAAGTGTGACATGCGCAATCAGCATGGTG CCCGCGTGTTGTATCAACTACTAGGGGTTCAACTGTGAGCCATGCGTTTTCTAGTT TGCTTGTTTCATTCATATAAGCTTGTATTACTTAGTTCCGAACCCTGTAGTTTTGTA GTCTATGTTCTCTTTTGTAGGGATGTGCTGTCATAAGATGTCATGCAAGTTTCTTG TCCTACATATC

[0514] SEQ ID NO:229

ACCAGATGTGGGCGGCTCAGTATTACACTTACAAGCGGCCACGGCAGTGGCTGTC

TTCGTCTGGTTTGGGGGCAATGGGATTTGGGTTACCAGCTGCAGCTGGCGCTGCT

GTGGCCAACCCAGGTGTTACAGTTGTTGACATTGATGGTGATGGTAGTTTCCTCA TGAACATTCAGGAGTTGGCGTTGATCCGCATTGAGAACCTCCCAGTGAAGGTGAT GATATTGAACAACCAGCATCTGGGAATGGTGGTGCAGTGGGAGGATAGGTTTTA CAAGGCCAATCGGGCGCACACATACCTTGGCAACCCAGAAAATGAGAGTGAGAT ATATCCAGATTTTGTGACGATTGCTAAAGGATTCAACGTTCCAGCAGTTCGAGTG ACGAAGAAGAGCGAAGTCACTGCAGCAATCAAGAAGATGCTTGAGACCCCAGGG CCATACTTGTTGGATATCATCGTCCCGCATCAGGAGCACGTGCTGCCTATGATCC CAAGCGGTGGTGCTTTCAAGGACATGATCATGGAGGGTGATGGCAGGACCTCGT ACTGAAATTTCGACCTACAAGACCTACAAGTGTGACATGCGCAATCAGCATGGTG CCCGCGTGTTGTATCAACTACTAGGGGTTCAACTGTGAGCCATGCGTTTTCTAGTT TGCTTGTTTCATTCATATAAGCTTGTATTACTTAGTTCCGAACCCTGTAGTTTTGTA GTCTATGTTCTCTTTTGTAGGGATGTGCTGTCATAAGATGTCATGCAAGTTTCTTG TCCTACATATC

[0515] SEQ ID NO:230

ACCAGATGTGGGCGGCTCAGTATTACACTTACAAGCGGCCACGGCAGTGGCTGTC TTCGTCTGGTTGGGGGCAATGGGATTTGGGTTACCAGCTGCAGCTGGCGCTGCTG TGGCCAACCCAGGTGTTACAGTTGTTGACATTGATGGAGATGGTAGTTTCCTCAT GAACATTCAGGAGTTGGCATTGATCCGTATTGAGAACCTCCCTGTGAAGGTGATG ATATTGAACAACCAGCATCTGGGAATGGTGGTGCAATGGGAGGATAGGTTTTAC AAGGCCAATCGGGCGCACACATACCTTGGCAACCCAGAAAATGAGAGTGAGATA TATCCAGATTTTGTGACGATTGCTAAAGGATTCAACGTTCCGGCAGTTCGTGTGA CGAAGAAGAGCGAAGTCACTGCAGCAATCAAGAAGATGCTTGAGACCCCAGGGC CATACTTGTTGGATATCATCGTCCCGCATCAGGAGCACGTGCTGCCTATGATtCCc AatGGcGGcGCTTTCAAGGACATGATCATGGAGGGTGATGGCAGGACCTCGTACTG AAATGGTCCGAAGGTCCACGCCGCCAACTACGAGTATGATCCCAAGCGGTGGTG CTTTCAAGGACATGATCATGGAGGGTGATGGCAGGACCTCGTACTGAAATTTCGA CCTACAAGACCTACAAGTGTGACATGCGCAATCAGCATGGTGCCCGCGTGTTGTA TCAACTACTAGGGGTTCAACTGTGAACCATGCGTTTTCTAGTTTGCTTGTTTCATT CATATAAGCTTGTGTTACTTAGTTCCGAACCCTGTAGCTTTGTAGTCTATGCTCTC TTTTGTAGGGATGTGCTGTCATAAGATATCATGCAAGTTTCTTGTCCTACATATC

[0516] SEQ ID NO:231

[0517] SEQ ID NO:232

[0518] SEQ ID NO:233

[0519] SEQ ID NO:234

[0520] SEQ ID NO:235

[0521] SEQ ID NO:236

[0522] SEQ ID NO:237

[0523] SEQ ID NO:238

[0524] SEQ ID NO:239

ATCAGATGTGGGCGGCTCAGTATTACACTTACAAGCGGCC ACGGCAGTGGCTGTC TTCATCCGGTTTGGGTGCAATGGGATTTGGGTTGCCAGCTGCAGCTGGCGCTGCT GTGGCCAACCCAGGTGTTACAGTTGTTGACATTGATGGGGATGGTAGTTTCCTCA TGAACATTCAGGAGTTGGCGTTGATCCGTATTGAGAACCTCCCAGTGAAGGTGAT GATATTGAACAACCAGCATCTGGGAATGGTGGTGCAGTGGGAGGATAGGTTTTA CAAGGCCAACCGGGCGCACACATACCTTGGCAACCCAGAAAATGAGAGTGAGAT ATATCCAGATTTTGTGACGATTGCTAAAGGATTCAACGTTCCGGCAGTTCGTGTG ACGAAGAAGAGCGAAGTCACTGCAGCAATCAAGAAGATGCTTGAGACCCCAGGG CCATACTTGTTGGATATCATTGTCCCGCATCAGGAGCACGTGCTGCCTATGATCCC AAGCGGTGGTGCTTTTAAGGACATGATCATGGAGGGTGATGGCAGGACCTCGTA CTGAAATTTCGACCTACAAGACCTACAAGTGTGACATGCGCAATCAGCATGATAC CTGCGTGTTGTATCAACTACTGGGGGTTCAACTGTGAACCATGCGTTTTCTAGTTT GCTTGTTTCATTCATATAAGCTTGTGTTACTTAGTTCCGAACCGTGTAGTTTTGTA GTCTCTGTTCTCTTTTGTAGGGATGTGCTGTCATAAGATATCATGCAAGTTTCTTG TCCTACATATC

[0525] SEQ ID NO:240

[0526] SEQ ID NO:241

[0527] SEQ ID NO:242

ACCAGATGTGGGCGGCTCAGTATTACACTTACAAGCGGCCACGGCAGTGGCTGTC

TTCGTCTGGTTGGGGGCAATGGGATTTGGGTTACCAGCTGCAGCTGGCGCTGCTG

TGGCCAACCCAGGTGTTACAGTTGTTGACATTGATGGAGATGGTAGTTTCCTCAT GAACATTC AGGAGTTGGC ATTGATCCGTATTGAGAACCTCCCTGTGAAGGTGATG ATATTGAACAACCAGCATCTGGGAATGGTGGTGCAATGGGAGGATAGGTTTTAC AAGGCCAATCGGGCGCACACATACCTTGGCAACCCAGAAAATGAGAGTGAGATA TATCCAGATTTTGTGACGATTGCTAAAGGATTCAACGTTCCGGCAGTTCGTGTGA CGAAGAAGAGCGAAGTCACTGCAGCAATCAAGAAGATGCTTGAGACCCCAGGGC CATACTTGTTGGATATCATCGTCCCGCATCAGGAGCACGTGCTGCCTATGATtCCc AatGGcGGcGCTTTCAAGGACATGATCATGGAGGGTGATGGCAGGACCTCGTACTG AAATGGTCCGAAGGTCCACGCCGCCAACTACGAGTATGATCCCAAGCGGTGGTG CTTTCAAGGACATGATCATGGAGGGTGATGGCAGGACCTCGTACTGAAATTTCGA CCTACAAGACCTACAAGTGTGACATGCGCAATCAGCATGGTGCCCGCGTGTTGTA TCAACTACTAGGGGTTCAACTGTGAACCATGCGTTTTCTAGTTTGCTTGTTTCATT CATATAAGCTTGTGTTACTTAGTTCCGAACCCTGTAGCTTTGTAGTCTATGCTCTC TTTTGTAGGGATGTGCTGTCATAAGATATCATGCAAGTTTCTTGTCCTACATATC

[0528] SEQ ID NO:243

[0529] SEQ ID NO:244

ATCAGATGTGGGCGGCTCAGTATTACACTTACAAGCGGCCACGGCAGTGGCTGTC TTCATCCGGTTTGGGTGCAATGGGATTTGGGTTGCCAGCTGCAGCTGGCGCTGCT GTGGCCAACCCAGGTGTTACAGTTGTTGACATTGATGGGGATGGTAGTTTCCTCA TGAACATTCAGGAGTTGGCGTTGATCCGTATTGAGAACCTCCCAGTGAAGGTGAT GATATTGAACAACCAGCATCTGGGAATGGTGGTGCAGTGGGAGGATAGGTTTTA CAAGGCCAACCGGGCGCACACATACCTTGGCAACCCAGAAAATGAGAGTGAGAT ATATCCAGATTTTGTGACGATTGCTAAAGGATTCAACGTTCCGGCAGTTCGTGTG ACGAAGAAGAGCGAAGTCACTGCAGCAATCAAGAAGATGCTTGAGACCCCAGGG CCATACTTGTTGGATATCATTGTCCCGCATCAGGAGCACGTGCTGCCtGcGCTTTC AAGGACATGATCATGGAGGGTGATGGCAGGACCTCGTACTGAAATGGTCCGAAG GTCCACGCCGCCAACTACGAGTATGATCCCAAGCGGTGGTGCTTTTAAGGACATG ATCATGGAGGGTGATGGCAGGACCTCGTACTGAAATTTCGACCTACAAGACCTAC AAGTGTGACATGCGCAATCAGCATGATACCTGCGTGTTGTATCAACTACTGGGGG TTCAACTGTGAACCATGCGTTTTCTAGTTTGCTTGTTTCATTCATATAAGCTTGTGT TACTTAGTTCCGAACCGTGTAGTTTTGTAGTCTCTGTTCTCTTTTGTAGGGATGTG CTGTCATAAGATATCATGCAAGTTTCTTGTCCTACATATC

[0530] SEQ ID NO:245

ATCAGATGTGGGCGGCTCAGTATTACACTTACAAGCGGCCACGGCAGTGGCTGTC

TTCATCCGGTTTGGGTGCAATGGGATTTGGGTTGCCAGCTGCAGCTGGCGCTGCT

[0531] SEQ ID NO:246

[0532] SEQ ID NO:247

ACCAGATGTGGGCGGCTCAGTATTACACTTACAAGCGGCCACGGCAGTGGCTGTC

TTCGTCTGGTTTGGGGGCAATGGGATTTGGGTTACCAGCTGCAGCTGGCGCTGCT

GTGGCCAACCCAGGTGTTACAGTTGTTGACATTGATGGTGATGGTAGTTTCCTCA TGAACATTCAGGAGTTGGCGTTGATCCGCATTGAGAACCTCCCAGTGAAGGTGAT GATATTGAACAACCAGCATCTGGGAATGGTGGTGCAGTGGGAGGATAGGTTTTA CAAGGCCAATCGGGCGCACACATACCTTGGCAACCCAGAAAATGAGAGTGAGAT ATATCCAGATTTTGTGACGATTGCTAAAGGATTCAACGTTCCAGCAGTTCGAGTG ACGAAGAAGAGCGAAGTCACTGCAGCAATCAAGAAGATGCTTGAGACCCCAGGG CCATACTTGTTGGATATCATAGTCCCGCATCAGGAGCACGTGCTGCCTATGATCC CAAGCGGTGGTGCTTTCAAGGACATGATCATGGAGGGTGATGGCAGGACCTCGT ACTGAAATTTCGACCTACAAGACCTACAAGTGTGACATGCGCAATCAGCATGATG CCCGCGTGTTGTATCAACTACTAGGGGTTCAACTGTGAGCCATGCGTTTTCTAGTT TGCTTGTTTCATTCATATAAGCTTGTATTACTTAGTTCCGAACCCTGTAGTTTTGTA GTCTATGTTCTCTTTTGTAGGGATGTGCTGTCATAAGATGTCATGCAAGTTTCTTG TCCTACATATC

[0533] SEQ ID NO:248

[0534] SEQ ID NO:249

ACCAGATGTGGGCGGCTCAGTATTACACTTACAAGCGGCCACGGCAGTGGCTGTC

TTCGTCTGGTTGGGGGCAATGGGATTTGGGTTACCAGCTGCAGCTGGCGCTGCTG

[0535] SEQ ID NO:250

[0536] SEQ ID NO:251

[0537] SEQ ID NO:252

ACCAGATGTGGGCGGCTCAGTATTACACTTACAAGCGGCCACGGCAGTGGCTGTC TTCGTCTGGTTTGGGGGCAATGGGATTTGGGTTACCAGCTGCAGCTGGCGCTGCT GTGGCCAACCCAGGTGTTACAGTTGTTGACATTGATGGTGATGGTAGTTTCCTCA TGAACATTCAGGAGTTGGCGTTGATCCGCATTGAGAACCTCCCAGTGAAGGTGAT GATATTGAACAACCAGCATCTGGGAATGGTGGTGCAGTGGGAGGATAGGTTTTA CAAGGCCAATCGGGCGCACACATACCTTGGCAACCCAGAAAATGAGAGTGAGAT ATATCCAGATTTTGTGACGATTGCTAAAGGATTCAACGTTCCAGCAGTTCGAGTG ACGAAGAAGAGCGAAGTCACTGCAGCAATCAAGAAGATGCTTGAGACCCCAGGG CCATACTTGTTGGATATCATAGTCCCGCATCAGGAGCACGTGCTGCCTATGATCC CAAGCGGTGGTGCTTTCAAGGACATGATCATGGAGGGTGATGGCAGGACCTCGT ACTGAAATTTCGACCTACAAGACCTACAAGTGTGACATGCGCAATCAGCATGATG CCCGCGTGTTGTATCAACTACTAGGGGTTCAACTGTGAGCCATGCGTTTTCTAGTT TGCTTGTTTCATTCATATAAGCTTGTATTACTTAGTTCCGAACCCTGTAGTTTTGTA GTCTATGTTCTCTTTTGTAGGGATGTGCTGTCATAAGATGTCATGCAAGTTTCTTG TCCTACATATC

[0538] SEQ ID NO:253

[0539] SEQ ID NO:254

[0540] SEQ ID NO:255

ACCAGATGTGGGCGGCTCAGTATTACACTTACAAGCGGCCACGGCAGTGGCTGTC

TTCGTCTGGTTGGGGGCAATGGGATTTGGGTTACCAGCTGCAGCTGGCGCTGCTG

TGGCCAACCCAGGTGTTACAGTTGTTGACATTGATGGAGATGGTAGTTTCCTCAT GAACATTCAGGAGTTGGCATTGATCCGTATTGAGAACCTCCCTGTGAAGGTGATG ATATTGAACAACCAGCATCTGGGAATGGTGGTGCAATGGGAGGATAGGTTTTAC AAGGCCAATCGGGCGCACACATACCTTGGCAACCCAGAAAATGAGAGTGAGATA TATCCAGATTTTGTGACGATTGCTAAAGGATTCAACGTTCCGGCAGTTCGTGTGA CGAAGAAGAGCGAAGTCACTGCAGCAATCAAGAAGATGCTTGAGACCCCAGGGC CATACTTGTTGGATATCATCGTCCCGCATCAGGAGCACGTGCTGCCTATGATCCC AAGCGGTGGTGCTTTCAAGGACATGATCATGGAGGGTGATGGCAGGACCTCGTA CTGAAATTTCGACCTACAAGACCTACAAGTGTGACATGCGCAATCAGCATGGTGC CCGCGTGTTGTATCAACTACTAGGGGTTCAACTGTGAACCATGCGTTTTCTAGTTT GCTTGTTTCATTCATATAAGCTTGTGTTACTTAGTTCCGAACCCTGTAGCTTTGTA GTCTATGCTCTCTTTTGTAGGGATGTGCTGTCATAAGATATCATGCAAGTTTCTTG TCCTACATATC

[0541] SEQ ID NO:256

[0542] SEQ ID NO:257

[0543] SEQ ID NO:258

ACCAGATGTGGGCGGCTCAGTATTACACTTACAAGCGGCCACGGCAGTGGCTGTC TTCGTCTGGTTTGGGGGCAATGGGATTTGGGTTACCAGCTGCAGCTGGCGCTGCT GTGGCCAACCCAGGTGTTACAGTTGTTGACATTGATGGTGATGGTAGTTTCCTCA TGAACATTCAGGAGTTGGCGTTGATCCGCATTGAGAACCTCCCAGTGAAGGTGAT GATATTGAACAACCAGCATCTGGGAATGGTGGTGCAGTGGGAGGATAGGTTTTA CAAGGCCAATCGGGCGCACACATACCTTGGCAACCCAGAAAATGAGAGTGAGAT ATATCCAGATTTTGTGACGATTGCTAAAGGATTCAACGTTCCAGCAGTTCGAGTG ACGAAGAAGAGCGAAGTCACTGCAGCAATCAAGAAGATGCTTGAGACCCCAGGG CCATACTTGTTGGATATCATAGTCCCGCATCAGGAGCACGTGCTGCCTATGATtCC cAatGGcGGcGCTTTCAAGGACATGATCATGGAGGGTGATGGCAGGACCTCGTACT GAAATGGTCCGAAGGTCCACGCCGCCAACTACGAGTATGATCCCAAGCGGTGGT GCTTTCAAGGACATGATCATGGAGGGTGATGGCAGGACCTCGTACTGAAATTTCG ACCTACAAGACCTACAAGTGTGACATGCGCAATCAGCATGATGCCCGCGTGTTGT ATCAACTACTAGGGGTTCAACTGTGAGCCATGCGTTTTCTAGTTTGCTTGTTTCAT TCATATAAGCTTGTATTACTTAGTTCCGAACCCTGTAGTTTTGTAGTCTATGTTCT CTTTTGTAGGGATGTGCTGTCATAAGATGTCATGCAAGTTTCTTGTCCTACATATC

[0544] SEQ ID NO:259

[0545] SEQ ID NO:260

[0546] SEQ ID NO:261

[0547] SEQ ID NO:262

[0548] SEQ ID NO:263

[0549] SEQ ID NO:264

ACCAGATGTGGGCGGCTCAGTATTACACTTACAAGCGGCCACGGCAGTGGCTGTC TTCGTCTGGTTTGGGGGCAATGGGATTTGGGTTACCAGCTGCAGCTGGCGCTGCT GTGGCCAACCCAGGTGTTACAGTTGTTGACATTGATGGTGATGGTAGTTTCCTCA TGAACATTCAGGAGTTGGCGTTGATCCGCATTGAGAACCTCCCAGTGAAGGTGAT GATATTGAACAACCAGCATCTGGGAATGGTGGTGCAGTGGGAGGATAGGTTTTA CAAGGCCAATCGGGCGCACACATACCTTGGCAACCCAGAAAATGAGAGTGAGAT ATATCCAGATTTTGTGACGATTGCTAAAGGATTCAACGTTCCAGCAGTTCGAGTG ACGAAGAAGAGCGAAGTCACTGCAGCAATCAAGAAGATGCTTGAGACCCCAGGG CCATACTTGTTGGATATCATAGTCCCGCATCAGGAGCACGTGTCCCAAGCGGTGG TGCTTTCAAGGACATGATCATGGAGGGTGATGGCAGGACCTCGTACTGAAATTTC GACCTACAAGACCTACAAGTGTGACATGCGCAATCAGCATGATGCCCGCGTGTTG TATCAACTACTAGGGGTTCAACTGTGAGCCATGCGTTTTCTAGTTTGCTTGTTTCA TTCATATAAGCTTGTATTACTTAGTTCCGAACCCTGTAGTTTTGTAGTCTATGTTCT CTTTTGTAGGGATGTGCTGTCATAAGATGTCATGCAAGTTTCTTGTCCTACATATC

[0550] SEQ ID NO:265

ACCAGATGTGGGCGGCTCAGTATTACACTTACAAGCGGCCACGGCAGTGGCTGTC TTCGTCTGGTTTGGGGGCAATGGGATTTGGGTTACCAGCTGCAGCTGGCGCTGCT GTGGCCAACCCAGGTGTTACAGTTGTTGACATTGATGGTGATGGTAGTTTCCTCA TGAACATTCAGGAGTTGGCGTTGATCCGCATTGAGAACCTCCCAGTGAAGGTGAT GATATTGAACAACCAGCATCTGGGAATGGTGGTGCAGTGGGAGGATAGGTTTTA CAAGGCCAATCGGGCGCACACATACCTTGGCAACCCAGAAAATGAGAGTGAGAT ATATCCAGATTTTGTGACGATTGCTAAAGGATTCAACGTTCCAGCAGTTCGAGTG ACGAAGAAGAGCGAAGTCACTGCAGCAATCAAGAAGATGCTTGAGACCCCAGGG CCATACTTGTTGGATATCATAGTCCCGCATCAGGAGCACGTGCTGCCTATGATtCC cAatGGcGGcGCTTTCAAGGACATGATCATGGAGGGTGATGGCAGGACCTCGTACT GAAATGGTCCGAAGGTCAAGCGGTGGTGCTTTCAAGGACATGATCATGGAGGGT GATGGCAGGACCTCGTACTGAAATTTCGACCTACAAGACCTACAAGTGTGACATG CGCAATCAGCATGATGCCCGCGTGTTGTATCAACTACTAGGGGTTCAACTGTGAG CCATGCGTTTTCTAGTTTGCTTGTTTCATTCATATAAGCTTGTATTACTTAGTTCCG AACCCTGTAGTTTTGTAGTCTATGTTCTCTTTTGTAGGGATGTGCTGTCATAAGAT GTCATGCAAGTTTCTTGTCCTACATATC

[0551] SEQ ID NO:266

ACCAGATGTGGGCGGCTCAGTATTACACTTACAAGCGGCCACGGCAGTGGCTGTC TTCGTCTGGTTGGGGGCAATGGGATTTGGGTTACCAGCTGCAGCTGGCGCTGCTG TGGCCAACCCAGGTGTTACAGTTGTTGACATTGATGGAGATGGTAGTTTCCTCAT GAACATTCAGGAGTTGGCATTGATCCGTATTGAGAACCTCCCTGTGAAGGTGATG ATATTGAACAACCAGCATCTGGGAATGGTGGTGCAATGGGAGGATAGGTTTTAC AAGGCCAATCGGGCGCACACATACCTTGGCAACCCAGAAAATGAGAGTGAGATA TATCCAGATTTTGTGACGATTGCTAAAGGATTCAACGTTCCGGCAGTTCGTGTGA CGAAGAAGAGCGAAGTCACTGCAGCAATCAAGAAGATGCTTGAGACCCCAGGGC CATACTTGTTGGATATCATCGTCCCGCATCAGGAGCACGTGCTGCCTATGATtCCc AatGGcGGcGCTTTCAAGGACATGATCATGGAGGGTGATGGCAGGACCTCGTACTG AAATTTgcAgGGTGGTGCTTTCAAGGACATGATCATGGAGGGTGATGGCAGGACC TCGTACTGAAATTTCGACCTACAAGACCTACAAGTGTGACATGCGCAATCAGCAT GGTGCCCGCGTGTTGTATCAACTACTAGGGGTTCAACTGTGAACCATGCGTTTTCT AGTTTGCTTGTTTCATTCATATAAGCTTGTGTTACTTAGTTCCGAACCCTGTAGCT TTGTAGTCTATGCTCTCTTTTGTAGGGATGTGCTGTCATAAGATATCATGCAAGTT TCTTGTCCTACATATC

[0552] SEQ ID NO:267

[0553] SEQ ID NO:268

[0554] SEQ ID NO:269

[0555] SEQ ID NO:270

ACCAGATGTGGGCGGCTCAGTATTACACTTACAAGCGGCCACGGCAGTGGCTGTC TTCGTCTGGTTTGGGGGCAATGGGATTTGGGTTACCAGCTGCAGCTGGCGCTGCT GTGGCCAACCCAGGTGTTACAGTTGTTGACATTGATGGTGATGGTAGTTTCCTCA TGAACATTCAGGAGTTGGCGTTGATCCGCATTGAGAACCTCCCAGTGAAGGTGAT GATATTGAACAACCAGCATCTGGGAATGGTGGTGCAGTGGGAGGATAGGTTTTA CAAGGCCAATCGGGCGCACACATACCTTGGCAACCCAGAAAATGAGAGTGAGAT ATATCCAGATTTTGTGACGATTGCTAAAGGATTCAACGTTCCAGCAGTTCGAGTG ACGAAGAAGAGCGAAGTCACTGCAGCAATCAAGAAGATGCTTGAGACCCCAGGG CCATACTTGTTGGATATCATAGTCCCGCATCAGGAGCACGTGCTGCCTATGATtCC cAatGGcGGcGCTTTCAAGGACATGATCATGGAGGGTGATGGCAGGACCTCGTACT GAAATggtccgaaggtCAAGCGGTGGTGCTTTCAAGGACATGATCATGGAGGGTGATG GCAGGACCTCGTACTGAAATTTCGACCTACAAGACCTACAAGTGTGACATGCGCA ATCAGCATGATGCCCGCGTGTTGTATCAACTACTAGGGGTTCAACTGTGAGCCAT GCGTTTTCTAGTTTGCTTGTTTCATTCATATAAGCTTGTATTACTTAGTTCCGAACC CTGTAGTTTTGTAGTCTATGTTCTCTTTTGTAGGGATGTGCTGTCATAAGATGTCA TGCAAGTTTCTTGTCCTACATATC

[0556] SEQ ID NO:271

[0557] SEQ ID NO:272

ACCAGATGTGGGCGGCTCAGTATTACACTTACAAGCGGCCACGGCAGTGGCTGTC TTCGTCTGGTTGGGGGCAATGGGATTTGGGTTACCAGCTGCAGCTGGCGCTGCTG TGGCCAACCCAGGTGTTACAGTTGTTGACATTGATGGAGATGGTAGTTTCCTCAT GAACATTCAGGAGTTGGCATTGATCCGTATTGAGAACCTCCCTGTGAAGGTGATG ATATTGAACAACCAGCATCTGGGAATGGTGGTGCAATGGGAGGATAGGTTTTAC AAGGCCAATCGGGCGCACACATACCTTGGCAACCCAGAAAATGAGAGTGAGATA TATCCAGATTTTGTGACGATTGCTAAAGGATTCAACGTTCCGGCAGTTCGTGTGA CGAAGAAGAGCGAAGTCACTGCAGCAATCAAGAAGATGCTTGAGACCCCAGGGC CATACTTGTTGGATATCATCGTCCCGCATCAGGAGCACGTGCTGCCTATGATtCCc AatGGcGGcGCTTTCAAGGACATGATCATGGAGGGTGATGGCAGGACCTCGTACTG AAATTTgcAggTACAAGATCCCAAGCGGTGGTGCTTTCAAGGACATGATCATGGAG GGTGATGGCAGGACCTCGTACTGAAATTTCGACCTACAAGACCTACAAGTGTGAC ATGCGCAATCAGCATGGTGCCCGCGTGTTGTATCAACTACTAGGGGTTCAACTGT GAACCATGCGTTTTCTAGTTTGCTTGTTTCATTCATATAAGCTTGTGTTACTTAGTT CCGAACCCTGTAGCTTTGTAGTCTATGCTCTCTTTTGTAGGGATGTGCTGTCATAA GATATCATGCAAGTTTCTTGTCCTACATATC

[0558] SEQ ID NO:273

ACCAGATGTGGGCGGCTCAGTATTACACTTACAAGCGGCCACGGCAGTGGCTGTC

TTCGTCTGGTTGGGGGCAATGGGATTTGGGTTACCAGCTGCAGCTGGCGCTGCTG

[0559] SEQ ID NO:274

[0560] SEQ ID NO:275

ATCAGATGTGGGCGGCTCAGTATTACACTTACAAGCGGCCACGGCAGTGGCTGTC

TTCATCCGGTTTGGGTGCAATGGGATTTGGGTTGCCAGCTGCAGCTGGCGCTGCT

[0561] SEQ ID NO:276

[0562] SEQ ID NO:277

ACCAGATGTGGGCGGCTCAGTATTACACTTACAAGCGGCCACGGCAGTGGCTGTC TTCGTCTGGTTTGGGGGCAATGGGATTTGGGTTACCAGCTGCAGCTGGCGCTGCT GTGGCCAACCCAGGTGTTACAGTTGTTGACATTGATGGTGATGGTAGTTTCCTCA TGAACATTCAGGAGTTGGCGTTGATCCGCATTGAGAACCTCCCAGTGAAGGTGAT GATATTGAACAACCAGCATCTGGGAATGGTGGTGCAGTGGGAGGATAGGTTTTA CAAGGCCAATCGGGCGCACACATACCTTGGCAACCCAGAAAATGAGAGTGAGAT ATATCCAGATTTTGTGACGATTGCTAAAGGATTCAACGTTCCAGCAGTTCGAGTG ACGAAGAAGAGCGAAGTCACTGCAGCAATCAAGAAGATGCTTGAGACCCCAGGG CCATACTTGTTGGATATCATAGTCCCGCATCAGGAGCACGTGCTGCCTATGATCC CAAGCGGTGGTGCTTTCAAGGACATG ATCATGG AGGGTGATGGCAGGACCTCGT ACTGAAATTTCGACCTACAAGACCTACAAGTGTGACATGCGCAATCAGCATGATG CCCGCGTGTTGTATCAACTACTAGGGGTTCAACTGTGAGCCATGCGTTTTCTAGTT TGCTTGTTTCATTCATATAAGCTTGTATTACTTAGTTCCGAACCCTGTAGTTTTGTA GTCTATGTTCTCTTTTGTAGGGATGTGCTGTCATAAGATGTCATGCAAGTTTCTTG TCCTACATATC

[0563] SEQ ID NO:299

TCCAAGGTTGCGGCCGCGCCCAAGGAACCCTTTTCTGGGCCAT: : : : :TACTCGGCCA CGACTGGTAATTTAATTTTCAATTTATTT

[0564] SEQ ID NO: 326

tggcaggatatattgtggtgtaaacaaattgacgcttagacaacttaataacacattgcgga cgtttttaatgtactgaattaacgccgaattgaattcgagctcggtaccactggattttggt tttaggaattagaaattttattgatagaagtattttacaaatacaaatacatactaagggtt tcttatatgctcaacacatgagcgaaaccctataagaaccctaattcccttatctgggaact actcacacattattctggagaaaaatagagagagatagatttgtagagagagactggtgatt tttgcggactctattagatctgggtaactggcctaactggccttggaggagctggcaactca aaatccctttgccaaaaaccaacatcatgccatccaccatgcttgtatccagctgcgcgcaa tgtaccccgggctgtgtatcccaaagcctcatgcaacctaacagatggatcgtttggaaggc ctataacagcaaccacagacttaaaaccttgcgcctccatagacttaagcaaatgtgtgtac aatgtggatcctaggcccaacctttgatgcctatgtgacacgtaaacagtactctcaactgt ccaatcgtaagcgttcctagccttccagggcccagcgtaagcaataccagccacaacaccct caacctcagcaaccaaccaagggtatctatcttgcaacctctcgagatcatcaatccactct tgtggtgtttgtggctctgtcctaaagttcactgtagacgtctcaatgtaatggttaacgat atcacaaaccgcggccatatcagctgctgtagctggcctaatctcaactggtctcctctccg gagacatggcttctacctacaaaaaagctccgcacgaggctgcatttgtcacaaatcatgaa aagaaaaactaccgatgaacaatgctgagggattcaaattctacccacaaaaagaagaaaga aagatctagcacatctaagcctgacgaagcagcagaaatatataaaaatataaaccatagtg cccttttcccctcttcctgatcttgtttagcatggcggaaattttaaaccccccatcatctc ccccaacaacggcggatcgcagatctacatccgagagccccattccccgcgagatccgggcc ggatccacgccggcgagagccccagccgcgagatcccgcccctcccgcgcaccgatctgggc gcgcacgaagccgcctctcgcccacccaaactaccaaggccaaagatcgagaccgagacgga aaaaaaaaacggagaaagaaagaggagaggggcggggtggttaccggcgcggcggcggcgga gggggaggggggaggagctcgtcgtccggcagcgaggggggaggaggtggaggtggtggtgg tggtggtggtagggttggggggatgggaggagaggggggggtatgtatatagtggcgatggg gggcgtttctttggaagcggagggagggccggcctcgtcgctggctcgcgatcctcctcgcg tttccggcccccacgacccggacccacctgctgttttttctttttcttttttttctttcttt ttttttttttggctgcgagacgtgcggtgcgtgcggacaactcacggtgatagtgggggggt gtggagactattgtccagttggctggactggggtgggttgggttgggttgggttgggctggg cttgctatggatcgtggatagcactttgggctttaggaactttaggggttgtttttgtaaat gttttgagtctaagtttatcttttatttttactagaaaaaatacccatgcgctgcaacgggg gaaagctattttaatcttattattgttcattgtgagaattcgcctgaatatatatttttctc 
aaaaattatgtcaaattagcatatgggtttttttaaagatatttcttatacaaatccctctg tatttacaaaagcaaacgaacttaaaacccgactcaaatacagatatgcatttccaaaagcg aataaacttaaaaaccaattcatacaaaaatgacgtatcaaagtaccgacaaaaacatcctc aatttttataatagtagaaaagagtaaatttcactttgggccaccttttattaccgatattt tactttataccaccttttaactgatgttttcacttttgaccaggtaatcttacctttgtttt attttggactatcccgactctcttctcaagcatatgaatgacctcgagtatgctagtctaga gtcgacctgcagggtgcagcgtgacccggtcgtgcccctctctagagataatgagcattgca tgtctaagttataaaaaattaccacatattttttttgtcacacttgtttgaagtgcagttta tctatctttatacatatatttaaactttactctacgaataatataatctatagtactacaat aatatcagtgttttagagaatcatataaatgaacagttagacatggtctaaaggacaattga gtattttgacaacaggactctacagttttatctttttagtgtgcatgtgttctccttttttt ttgcaaatagcttcacctatataatacttcatccattttattagtacatccatttagggttt agggttaatggtttttatagactaatttttttagtacatctattttattctattttagcctc taaattaagaaaactaaaactctattttagtttttttatttaataatttagatataaaatag aataaaataaagtgactaaaaattaaacaaataccctttaagaaattaaaaaaactaaggaa acatttttcttgtttcgagtagataatgccagcctgttaaacgccgtcgacgagtctaacgg acaccaaccagcgaaccagcagcgtcgcgtcgggccaagcgaagcagacggcacggcatctc tgtcgctgcctctggacccctctcgagagttccgctccaccgttggacttgctccgctgtcg gcatccagaaattgcgtggcggagcggcagacgtgagccggcacggcaggcggcctcctcct cctctcacggcacggcagctacgggggattcctttcccaccgctccttcgctttcccttcct cgcccgccgtaataaatagacaccccctccacaccctctttccccaacctcgtgttgttcgg agcgcacacacacacaaccagatctcccccaaatccacccgtcggcacctccgcttcaaggt acgccgctcgtcctccccccccccccctctctaccttctctagatcggcgttccggtccatg gttagggcccggtagttctacttctgttcatgtttgtgttagatccgtgtttgtgttagatc cgtgctgctagcgttcgtacacggatgcgacctgtacgtcagacacgttctgattgctaact tgccagtgtttctctttggggaatcctgggatggctctagccgttccgcagacgggatcgat ttcatgattttttttgtttcgttgcatagggtttggtttgcccttttcctttatttcaatat atgccgtgcacttgtttgtcgggtcatcttttcatgcttttttttgtcttggttgtgatgat gtggtctggttgggcggtcgttctagatcggagtagaattctgtttcaaactacctggtgga tttattaattttggatctgtatgtgtgtgccatacatattcatagttacgaattgaagatga tggatggaaatatcgatctaggataggtatacatgttgatgcgggttttactgatgcatata
cagagatgctttttgttcgcttggttgtgatgatgtggtgtggttgggcggtcgttcattcg ttctagatcggagtagaatactgtttcaaactacctggtgtatttattaattttggaactgt atgtgtgtgtcatacatcttcatagttacgagtttaagatggatggaaatatcgatctagga taggtatacatgttgatgtgggttttactgatgcatatacatgatggcatatgcagcatcta ttcatatgctctaaccttgagtacctatctattataataaacaagtatgttttataattatt ttgatcttgatatacttggatgatggcatatgcagcagctatatgtggatttttttagccct gccttcatacgctatttatttgcttggtactgtttcttttgtcgatgctcaccctgttgttt ggtgttacttctgcaggaggatcacaagtttgtacaaaaaagcaggctatggccgccgccac ctcccccgccgtcgcattctcgggcgccaccgccgccgccatgcccaaacccgcccgccatc ctctcccgcgccaccagcccgtctcgcgccgcgcgctccccgcccgcgtcgtcaggtgttgc gccgcgtcccccgccgccacctccgccgcgcctcccgcaaccgcgctccggccctggggccc gtccgagccccgcaagggcgccgacatcctcgtcgaggcgctcgagcgctgcggcatcgtcg acgtcttcgcctaccccggcggcgcctccatggagatccaccaggcgctgacgcgctcgccc gtcatcaccaaccacctcttccgccacgagcagggggaggcgttcgcggcgtccggctacgc ccgcgcgtccggccgcgtcggcgtctgcgtcgccacctccggcccgggggccaccaacctcg tctccgcgctcgccgacgccctcctcgactccatccccatggtcgccatcacgggccaggtc tcccgccgcatgatcggcacggacgcgttccaggagacgcccatagtggaggtcacgcgctc catcaccaagcacaactacctggtccttgacgtggaggatatcccccgcgtcatccaggaag ccttcttccttgcatcctctggccgcccggggccggtgctagttgatatccccaaggacatc cagcagcagatggctgtgcccgtctgggacactccaatgagtttgccagggtacatcgcccg cctgcccaagccaccatctactgaatcgcttgagcaggtcctgcgtctggttggcgagtcac ggcgcccaattctgtatgttggtggtggctgcgctgcgtctggcgaggagttgcgccgcttt gttgagcttactgggattccagttacaactactctgatgggccttggcaacttccccagcga cgacccactgtctctgcgcatgcttgggatgcatggcactgtgtatgcaaattatgcagtag ataaggctgacctgttgctcgcatttggtgtgcggtttgatgatcgtgtgactgggaaaatc gaggcttttgcaagcaggtccaagattgtgcacattgacattgacccagctgagattggcaa gaacaagcagccacatgtctccatttgtgcagatgttaagcttgctttacaggggttgaatg atctattaaatgggagcaaagcacaacagggtctggattttggtccatggcacaaggagttg gatcagcagaagagggagtttcctctaggattcaagacttttggcgaggccatcccgccgca atatgctatccaggtactggatgagctgacaaaaggggaggcgatcattgccactggtgttg ggcagcaccagatgtgggcggctcagtattacacttacaagcggccacggcagtggctgtct 
tcgtctggtttgggggcaatgggatttgggttaccagctgcagctggcgctgctgtggccaa cccaggtgttacagttgttgacattgatggtgatggtagtttcctcatgaacattcaggagt tggcgttgatccgcattgagaacctcccagtgaaggtgatgatattgaacaaccagcatctg ggaatggtggtgcagtgggaggataggttttacaaggccaatcgggcgcacacataccttgg caacccagaaaatgagagtgagatatatccagattttgtgacgattgctaaaggattcaacg ttccagcagttcgagtgacgaagaagagcgaagtcactgcagcaatcaagaagatgcttgag accccagggccatacttgttggatatcatagtcccgcatcaggagcacgtgctgcctatgat cccaagcggtggtgctttcaaggacatgatcatggagggtgatggcaggacctcgtactgat acccagctttcttgtacaaagtggtgatcctactagtagaaggagtgcgtcgaagcagatcg ttcaaacatttggcaataaagtttcttaagattgaatcctgttgccggtcttgcgatgatta tcatataatttctgttgaattacgttaagcatgtaataattaacatgtaatgcatgacgtta tttatgagatgggtttttatgattagagtcccgcaattatacatttaatacgcgatagaaaa caaaatatagcgcgcaaactaggataaattatcgcgcgcggtgtcatctatgttactagatc gaaagcttagcttgagcttggatcagattgtcgtttcccgccttcagtttaaactatcagtg tttgacaggatatattggcgggtaaac

[0565] SEQ ID NO: 327

gcgaagatcc aggacaagga

[0566] SEQ ID NO: 328

ctgcttaccg gcaaagatga g

[0567] SEQ ID NO: 329

ttcccccgga ccagcagcgt

[0568] SEQ ID NO: 330

ccgacgagaa agaccagcaa

[0569] SEQ ID NO: 331

cttaagttgt cgatcgggac tgt

[0570] SEQ ID NO: 332

tgagcctctc gtcgccgatc acat

[0571] SEQ ID NO: 333

ccactcttgccctacacgacactgaagaccttatgattccaaacggcggcgccttcaaggac atgatcatggagggtgatggcaggacctcgtactgaaatttcgacctacaagacctacaagt gtgacatgcgcaatcagcatggtgcccgcgtgttgtatcaactactaggggttcaactgtga accatgcgttttctagtttgcttgtttcattcatataagcttgtgttacttagttccgaacc ctgtagctttgtagtctatgctctcttttgtagggatgtgctgtcataagatatcatgcaag tttcttgtcctacatatcaataataagtacttccatggaataattctcagttctgttttgaa ttttgcatcttctcacaaacagtgtgctggttcctttctgttcgctgacgccctcctcgact ccatccccatggtcgccatcacgggccaggtcccccgccgcatgatcggtagcgacttcgtg ggcgaggaaagcctttcgtccaaggtggtccctcctcgcaatcttgttggatggtgaatatt ataaaagcctgcccttctcgcgggtaagactcccgcccatccaggatgaggatgaccagcct tttgcagtttatccactagggacaggattgcatcctgccgaaaccctgccaagcttgaggta gcctccaatttgacggtgccgccagcgacgccgtctggaactgtcctttttgaggaccactc cgtttgtctagaggtacctggagatcatgacattaaggatgaccagttcgtaaaggtcctgc ggtgtctattgcttttcataggttaataagtgtttgctagactgtggtgaaaggccaagact cccgcccatctctctatgcccgggacaagtgccaccccacagtggggcaggatgaggatgac caaagactcccgcccatctcactagggacaggattggccttttgcagtttatctctatgccc gggacaagtgtatccgaagtaaataaaaccatcggactctcgtataagactgtcgactcgac cggccgacgcataggttcatttgaagctgctattctatttaaattgaaactcggacggtagc agtgtggtatgaggtcttcagcacactcggtaactccagtcac

[0572] SEQ ID NO: 334

ccactcttgccctacacgacactgaagacgtcgccattaccgggcaagtgacccgccgcatg atcggcacggacgcgttccaggagacgcccatagtggaggtcacgcgctccatcaccaagca caactacctggtccttgacgtggaggatatcccccgcgtcatccaggaagccttcttccttg catcctctggccgcccggggccggtgctagttgatatccccaaggacatccagcagcagatg gctgtgcccgtctgggacactccaatgagtttgccagggtacatcgcccgcctgcccaagcc accatctactgaatcgcttgagcaggtcctgcgtctggttggcgagtcacggcgcccaattc tgtatgttggtggtggctgcgctgcgtctggcgaggagttgcgccgctttgttgagcttact gggattccagttacaactactctgatgggccttggcaacttccccagcgacgacccactgtc tctgcgcatgcttgggatgcatggcactgtgtatgcaaattatgcagtagataaggctgacc tgttgctcgcatttggtgtgcggtttgatgatcgtgtgactgggaaaatcgaggcttttgca agcaggtccaagattgtgcacattgacattgacccagctgagattggcaagaacaagcagcc acatgtctccatttgtgcagatgttaagcttgctttacaggggttgaatgatctattaaatg ggagcaaagcacaacagggtctggattttggtccatggcacaaggagttggatcagcagaag agggagtttcctctaggattcaagacttttggcgaggccatcccgccgcaatatgctatcca ggtactggatgagctgacaaaaggggaggcgatcattgccactggtgttgggcagcaccaga tgtgggcggctcagtattacacttacaagcggccacggcagtggctgtcttcgtctggtttg ggggcaatgggatttgggttaccagctgcagctggcgctgctgtggccaacccaggtgttac agttgttgacattgatggtgatggtagtttcctcatgaacattcaggagttggcgttgatcc gcattgagaacctcccagtgaaggtgatgatattgaacaaccagcatctgggaatggtggtg cagtgggaggataggttttacaaggccaatcgggcgcacacataccttggcaacccagaaaa tgagagtgagatatatccagattttgtgacgattgctaaaggattcaacgttccagcagttc gagtgacgaagaagagcgaagtcactgcagcaatcaagaagatgcttgagaccccagggcca tacttgttggatatcatagtcccgcatcaggagcacgtgctgcctatgatcccaagcggtgg tgctttcaaggacatgatcatggagggtgatggcaggacctcgtactgaaatttcgacctac aagacctacaagtgtgacatgcgcaatcagcatggtgcccgcgtgttgtatcaactactagg ggttcaactgtgaaccatgcgttttctagtttgcttgtttcattcatataagcttgtgttac ttagttccgaaccctgtagctttgtagtctatgctctcttttgtagggatgtgctgtcataa gatatcatgcaagtttcttgtcctacatatcaataataagtacttccatggaataattctca gttctgttttgaattttgcatcttctcacaaacagtgtgctggttcctttctgttctacgcc cgcgcgtccggccgcgtcggcgtctgcgtcgccacctccggcccgggggccaccaacctcgt ctccgtagcgacttcgtgggcgaggaaagcctttcgtccaaggtggtccctcctcgcaatct 
tgttggatggtgaatattataaaagcctgcccttctcgcgggtgagtccatgctcaacaccg tgcactagggacaggattggccttttgcagtttatccactagggacaggattgcatcctgcc gaaaccctgccaagcttgaggtagcctccaatttgacggtgccgccagcgacgccgtctgga actgtcctttttgaggaccactccgtttgtctagaggtacctggagatcatgacattaagga tgaccagttcgtaaaggtcctgcggtgtctattgcttttcataggttaataagtgtttgcta gactgtggtgaaaggccgccttttgcagtttatctctagaaagactggagttgcagaaagac tcccgcccatccaggatgaggatgaccatatccgaagtaaataaaaccatcggactctcgta taagactgtcgactcgaccggccgacgcataggttcatttgaagctgctattctatttaaat tgaaactcggacggtagcagtgtggtatgaggtcttcagcacactcggtaactccagtcac

[0573] SEQ ID NO: 335

tgagattggcaagaacaagcagccacatgtctccatttgtgcagatgttaagcttgctttac aggggttgaatgatctattaaatgggagcaaagcacaacagggtctggattttggtccatgg cacaaggagttggatcagcagaagagggagtttcctctaggattcaagacttttggcgaggc catcccgccgcaatatgctatccaggtactggatgagctgacaaaaggggaggcgatcattg ccactggtgttgggcagcaccagatgtgggcggctcagtattacacttacaagcggccacgg cagtggctgtcttcgtctggtttgggggcaatgggatttgggttaccagctgcagctggcgc tgctgtggccaacccaggtgttacagttgttgacattgatggtgatggtagtttcctcatga acattcaggagttggcgttgatccgcattgagaacctcccagtgaaggtgatgatattgaac aaccagcatctgggaatggtggtgcagtgggaggataggttttacaaggccaatcgggcgca cacataccttggcaacccagaaaatgagagtgagatatatccagattttgtgacgattgcta aaggattcaacgttccagcagttcgagtgacgaagaagagcgaagtcactgcagcaatcaag aagatgcttgagaccccagggccatacttgttggatatcatagtcccgcatcaggagcacgt gctgcctatgattccaaacggcggcgccttcaaggacatgatcatggagggtgatggcagga cctcgtactgaaatttcgacctacaagacctacaagtgtgacatgcgcaatcagcatggtgc ccgcgtgttgtatcaactactaggggttcaactgtgaaccatgcgttttctagtttgcttgt ttcattcatataagcttgtgttacttagttccgaaccctgtagctttgtagtctatgctctc ttttgtagggatgtgctgtcataagatatcatgcaagtttcttgtcctacatatcaataata agtacttccatggaataattctcagttctgttttgaattttgcatcttctcacaaacagtgt gctggttcctttctgttcgctgacgccctcctcgactccatccccatggtcgccatcacggg ccaggtcccccgccgcatgatcggtagcgacttcgtgggcgaggaaagcctttcgtccaagg tggtccctcctcgcaatcttgttggatggtgaatattataaaagcctgcccttctcgcgggt aagactcccgcccatccaggatgaggatgaccagccttttgcagtttatccactagggacag gattgcatcctgccgaaaccctgccaagcttgaggtagcctccaatttgacggtgccgccag cgacgccgtctggaactgtcctttttgaggaccactccgtttgtctagaggtacctggagat catgacattaaggatgaccagttcgtaaaggtcctgcggtgtctattgcttttcataggtta ataagtgtttgctagactgtggtgaaaggccaagactcccgcccatctctctatgcccggga caagtgccaccccacagtggggcaggatgaggatgaccaaagactcccgcccatctcactag ggacaggattggccttttgcagtttatctctatgcccgggacaagtgtatccgaagtaaata aaaccatcggactctcgtataagactgtcgactcgaccggccgacgcataggttcatttgaa gctgctattctatttaaattgaaatcccaagcggtggtgctttcaaggacatgatcatggag ggtgatggcaggacctcgtactgaaatttcgacctacaagacctacaagtgtgacatgcgca 
atcagcatgatgcccgcgtgttgtatcaactactaggggttcaactgtgagccatgcgtttt ctagtttgcttgtttcattcatataagcttgtattacttagttccgaaccctgtagttttgt agtctatgttctcttttgtagggatgtgctgtcataagatgtcatgcaagtttcttgtccta catatcaataataagtacttccatggaataattctcagttctgttttgaattttgcatcttc tcacaaacagtgtgctggttcctttctgttactttacatgtctgctgtgtcaggttctgaca taacgaccgatggagggtggtcggcaggttttagaaggggaattgaaacttttttttgggaa gaagtctgaatacagttgggaggaaaaatagaagtatatacttcgattaatttatcaagccc gctatccagtctaatttatcaagcactagacagtgtagggtgttggcattcttctcttcctt gagatccggcttgagaggagagaccgaggcttcggctgtgttggttgctgatttctacagct ttttgagatagagagagagatcctgcaactgtggtttgtcttgctgcttgtacagcgagaga gacattgagagatatgtagatcgtttacc

[0574] SEQ ID NO:350

CCAGAAGGTAATTATCCAAGATGTAGCATCAAGAATCCAATGTTTACGGGAAAA ACTATGGAAGTATTATGTAAGCTCAGCAAGAAGCAGATCAATATGCGGCACATA TGCAACCTATGTTCAAAAATGAAGAATGTACAGATACAAGATCCTATACTGCCAG AATACGAAGAAGAATACGTAGAAATTGAAAAAGAAGAACCAGGCGAAGAAAAG AATCTTGAAGACGTAAGCACTGACGACAACAATGAAAAGAAGAAGATAAGGTCG GTGATTGTGAAAGAGACATAGAGGACACATGTAAGGTGGAAAATGTAAGGGCGG AAAGTAACCTTATCACAAAGGAATCTTATCCCCCACTACTTATCCTTTTATATTTT TCCGTGTCATTTTTGCCCTTGAGTTTTCCTATATAAGGAACCAAGTTCGGCATTTG TGAAAACAAGAAAAAATTTGGTGTAAGCTATTTTCTTTGAAGTACTGAGGATACA ACTTCAGAGAAATTTGTAAGTTTGTAGATCTCCATGGCTCCAAGGAAGAGGAAGG AGTCTAACAGGGAGTCAGCTAGGAGGTCAAGGTACAGGAAGGTGGGTATCCACG GGGTACCCGCCGCTATGGCTGAGAGGCCCTTCCAGTGTCGAATCTGCATGCGTAA CTTCAGTCGTAGTGACAACCTGAGCAACCACATCCGCACCCACACAGGCGAGAA GCCTTTTGCCTGTGACATTTGTGGGAGGAAATTTGCCACCAGCAGCAGCCGCATA AACCATACCAAGATACACACGGGCAGCCAAAAGCCCTTCCAGTGTCGAATCTGC ATGCGTAACTTCAGTCGTAGTGACAACCTGAGCGAACACATCCGCACCCACACAG GCGAGAAGCCTTTTGCCTGTGACATTTGTGGGAGGAAATTTGCCGCCAGCAAGAC CCGCAAAAACCATACCAAGATACACACGGGCGAGAAGCCCTTCCAGTGTCGAAT CTGCATGCGTAAGTTTGCCCGCTCCGACGCCCTGACCCAGCATGCCCAGAGATGC GGACTGCGGGGATCCCAACTTGTGAAATCAGAATTGGAAGAGAAAAAGTCTGAG CTTAGACACAAATTGAAGTACGTTCCACATGAATATATCGAACTTATCGAGATTG CTAGGAACTCAACACAGGACAGAATTTTGGAGATGAAGGTTATGGAGTTCTTTAT GAAAGTGTACGGATATAGGGGAAAGCACCTTGGTGGTTCTAGGAAACCTGATGG TGCAATCTACACTGTGGGATCACCTATTGACTATGGTGTTATCGTGGATACAAAG GCATACTCTGGTGGATACAATTTGCCAATCGGACAAGCTGACGAAATGCAGAGA TATGTTGAAGAGAACCAAACTAGAAACAAACATATTAATCCAAATGAATGGTGG AAGGTGTATCCTTCATCTGTTACAGAGTTCAAATTCCTTTTTGTGTCTGGACACTT TAAGGGTAACTACAAAGCACAGCTTACTAGGTTGAACCATATTACAAATTGCAAT GGTGCTGTGTTGTCAGTTGAAGAGCTTTTGATCGGAGGTGAAATGATTAAGGCAG GAACACTTACTTTGGAGGAAGTTAGAAGAAAATTCAACAACGGTGAAATCAATT TTAGATCTGGCGGCGGAGAGGGC AGAGGAAGTCTTCTAAC ATGCGGTGACGTGG AGGAGAATCCCGGCCCTAGGATGGCTCCAAGGAAGAGGAAGGAGTCTAACAGGG AGTCAGCTAGGAGGTCAAGGTACAGGAAGGTGGGTATCCACGGGGTACCCGCCG CTATGGCTGAGAGGCCCTTCCAGTGTCGAATCTGCATGCGTAACTTCAGTCGTAG TGACACCCTGAGCACGCACATCCGCACCCACACAGGCGAGAAGCCTTTTGCCTGT 
GACATTTGTGGGAGGAAATTTGCCGACAGGAGCAGCCGCATAAAGCATACCAAG ATACACACGGGATCTCAGAAGCCCTTCCAGTGTCGAATCTGCATGCGTAACTTCA GTCGCTCCGACGACCTGTCCAAGCACATCCGCACCCACACAGGCGAGAAGCCTTT TGCCTGTGACATTTGTGGGAGGAAGTTTGCCGACAACTCCAACCGCATCAAGCAT GCCCAGAGATGCGGACTGCGGGGATCCCAACTTGTGAAATCAGAATTGGAAGAG AAAAAGTCTGAGCTTAGACACAAATTGAAGTACGTTCCACATGAATATATCGAAC TTATCGAGATTGCTAGGAACTCAACACAGGACAGAATTTTGGAGATGAAGGTTAT GGAGTTCTTTATGAAAGTGTACGGATATAGGGGAAAGCACCTTGGTGGTTCTAGG AAACCTGATGGTGCAATCTACACTGTGGGATCACCTATTGACTATGGTGTTATCG TGGATACAAAGGCATACTCTGGTGGATACAATTTGCCAATCGGACAAGCTGACG AAATGCAGAGATATGTTGAAGAGAACCAAACTAGAAACAAACATATTAATCCAA ATGAATGGTGGAAGGTGTATCCTTCATCTGTTACAGAGTTCAAATTCCTTTTTGTG TCTGGACACTTTAAGGGTAACTACAAAGCACAGCTTACTAGGTTGAACCATATTA CAAATTGCAATGGTGCTGTGTTGTCAGTTGAAGAGCTTTTGATCGGAGGTGAAAT GATTAAGGCAGGAACACTTACTTTGGAGGAAGTTAGAAGAAAATTCAACAACGG TGAAATCAATTTTTGATAACTCGAGCTCGGTCACCAGCATAATTTTTATTAATGTA CTAAATTACTGTTTTGTTAAATGCAATTTTGCTTTCTCGGGATTTTAATATCAAAA TCTATTTAGAAATACACAATATTTTGTTGCAGGCTTGCTGGAGAATCGATCTGCTA TCATAAAAATTACAAAAAAATTTTATTTGCCTCAATTATTTTAGGATTGGTATTAA GGACGCTTAAATTATTTGTCGGGTCACTACGCATCATTGTGATTGAGAAGATCAG CGATACGAAATATTCGTAGTACTATCGATAATTTATTTGAAAATTCATAAGAAAA GCAAACGTTACATGAATTGATGAAACAATACAAAGACAGATAAAGCCACGCACA TTTAGGATATTGGCCGAGATTACTGAATATTGAGTAAGATCACGGAATTTCTGAC AGGAGCATGTCTTCAATTCAGCCCAAATGGCAGTTGAAATACTCAAACCGCCCCA TATGCAGGAGCGGATCATTCATTGTTTGTTTGGTTGCCTTTGCCAACATGGGAGTC CAAGGTT

[0575] SEQ ID NO:351

CCAGAAGGTAATTATCCAAGATGTAGCATCAAGAATCCAATGTTTACGGGAAAA ACTATGGAAGTATTATGTAAGCTCAGCAAGAAGCAGATCAATATGCGGCACATA TGCAACCTATGTTCAAAAATGAAGAATGTACAGATACAAGATCCTATACTGCCAG AATACGAAGAAGAATACGTAGAAATTGAAAAAGAAGAACCAGGCGAAGAAAAG AATCTTGAAGACGTAAGCACTGACGACAACAATGAAAAGAAGAAGATAAGGTCG GTGATTGTGAAAGAGACATAGAGGACACATGTAAGGTGGAAAATGTAAGGGCGG AAAGTAACCTTATCACAAAGGAATCTTATCCCCCACTACTTATCCTTTTATATTTT TCCGTGTCATTTTTGCCCTTGAGTTTTCCTATATAAGGAACCAAGTTCGGCATTTG TGAAAACAAGAAAAAATTTGGTGTAAGCTATTTTCTTTGAAGTACTGAGGATACA ACTTCAGAGAAATTTGTAAGTTTGTAGATCTCCATGGCTCCAAGGAAGAGGAAGG AGTCTAACAGGGAGTCAGCTAGGAGGTCAAGGTACAGGAAGGTGGGTATCCACG GGGTACCCGCCGCTATGGCTGAGAGGCCCTTCCAGTGTCGAATCTGCATGCGTAA CTTCAGTCAGTCCTCCGACCTGTCCCGCCACATCCGCACCCACACCGGCGAGAAG CCTTTTGCCTGTGACATTTGTGGGAGGAAATTTGCCCAGGCCGGCAACCTGTCCA AGCATACCAAGATACACACGCATCCCAGGGCACCTATTCCCAAGCCCTTCCAGTG TCGAATCTGCATGCGTAAGTTTGCCCAGTCCGGCGACCTGACCCGCCATACCAAG ATACACACGGGCGAGAAGCCCTTCCAGTGTCGAATCTGCATGCGTAACTTCAGTA CCTCCGGCTCCCTGTCCCGCCACATCCGCACCCACACCGGCGAGAAGCCTTTTGC CTGTGACATTTGTGGGAGGAAATTTGCCCAGTCCGGCAACCTGGCCCGCCATGCC CAGAGATGCGGACTGCGGGGATCCCAACTTGTGAAATCAGAATTGGAAGAGAAA AAGTCTGAGCTTAGACACAAATTGAAGTACGTTCCACATGAATATATCGAACTTA TCGAGATTGCTAGGAACTCAACACAGGACAGAATTTTGGAGATGAAGGTTATGG AGTTCTTTATGAAAGTGTACGGATATAGGGGAAAGCACCTTGGTGGTTCTAGGAA ACCTGATGGTGCAATCTACACTGTGGGATCACCTATTGACTATGGTGTTATCGTG GATACAAAGGCATACTCTGGTGGATACAATTTGCCAATCGGACAAGCTGACGAA ATGCAGAGATATGTTGAAGAGAACCAAACTAGAAACAAACATATTAATCCAAAT GAATGGTGGAAGGTGTATCCTTCATCTGTTACAGAGTTCAAATTCCTTTTTGTGTC TGGACACTTTAAGGGTAACTACAAAGCACAGCTTACTAGGTTGAACCATATTACA AATTGCAATGGTGCTGTGTTGTCAGTTGAAGAGCTTTTGATCGGAGGTGAAATGA TTAAGGCAGGAACACTTACTTTGGAGGAAGTTAGAAGAAAATTCAACAACGGTG AAATCAATTTTAGATCTGGCGGCGGAGAGGGCAGAGGAAGTCTTCTAACATGCG GTGACGTGGAGGAGAATCCCGGCCCTAGGATGGCTCCAAGGAAGAGGAAGGAGT CTAACAGGGAGTCAGCTAGGAGGTCAAGGTACAGGAAGGTGGGTATCCACGGGG TACCCGCCGCTATGGCTGAGAGGCCCTTCCAGTGTCGAATCTGCATGCGTAACTT CAGTACCTCCGGCTCCCTGTCCCGCCACATCCGCACCCACACCGGCGAGAAGCCT 
TTTGCCTGTGACATTTGTGGGAGGAAATTTGCCCTGCGCCAGACCCTGCGCGACC ATACCAAGATACACACGGGCAGCCAAAAGCCCTTCCAGTGTCGAATCTGCATGC GTAACTTCAGTACCTCCGGCAACCTGACCCGCCACATCCGCACCCACACCGGCGA GAAGCCTTTTGCCTGTGACATTTGTGGGAGGAAATTTGCCGACCGCTCCGCCCTG GCCCGCCATACCAAGATACACACGGGATCTCAGAAGCCCTTCCAGTGTCGAATCT GCATGCGTAACTTCAGTCGCTCCGACGTGCTGTCCGAGCACATCCGCACCCACAC CGGCGAGAAGCCTTTTGCCTGTGACATTTGTGGGAGGAAATTTGCCCGCAACTTC TCCCTGACCATGCATGCCCAGAGATGCGGACTGCGGGGATCCCAACTTGTGAAAT CAGAATTGGAAGAGAAAAAGTCTGAGCTTAGACACAAATTGAAGTACGTTCCAC ATGAATATATCGAACTTATCGAGATTGCTAGGAACTCAACACAGGACAGAATTTT GGAGATGAAGGTTATGGAGTTCTTTATGAAAGTGTACGGATATAGGGGAAAGCA CCTTGGTGGTTCTAGGAAACCTGATGGTGCAATCTACACTGTGGGATCACCTATT GACTATGGTGTTATCGTGGATACAAAGGCATACTCTGGTGGATACAATTTGCCAA TCGGACAAGCTGACGAAATGCAGAGATATGTTGAAGAGAACCAAACTAGAAACA AACATATTAATCCAAATGAATGGTGGAAGGTGTATCCTTCATCTGTTACAGAGTT CAAATTCCTTTTTGTGTCTGGACACTTTAAGGGTAACTACAAAGCACAGCTTACT AGGTTGAACCATATTACAAATTGCAATGGTGCTGTGTTGTCAGTTGAAGAGCTTT TGATCGGAGGTGAAATGATTAAGGCAGGAACACTTACTTTGGAGGAAGTTAGAA GAAAATTCAACAACGGTGAAATCAATTTTTGATAACTCGAGCTCGGTCACCAGCA TAATTTTTATTAATGTACTAAATTACTGTTTTGTTAAATGCAATTTTGCTTTCTCGG GATTTTAATATCAAAATCTATTTAGAAATACACAATATTTTGTTGCAGGCTTGCTG GAGAATCGATCTGCTATCATAAAAATTACAAAAAAATTTTATTTGCCTCAATTAT TTTAGGATTGGTATTAAGGACGCTTAAATTATTTGTCGGGTCACTACGCATCATTG TGATTGAGAAGATCAGCGATACGAAATATTCGTAGTACTATCGATAATTTATTTG AAAATTCATAAGAAAAGCAAACGTTACATGAATTGATGAAACAATACAAAGACA GATAAAGCCACGCACATTTAGGATATTGGCCGAGATTACTGAATATTGAGTAAGA TCACGGAATTTCTGACAGGAGCATGTCTTCAATTCAGCCCAAATGGCAGTTGAAA TACTCAAACCGCCCCATATGCAGGAGCGGATCATTCATTGTTTGTTTGGTTGCCTT TGCCAACATGGGAGTCCAAGGTT

[0576] SEQ ID NO:352

GCCCAAGGAACCCTTTTCTGGGCCATCTTCGTACTCGGCCACGACTGGTAATTTA ATGGATCCAACCGACAACCACTTTGCGGACTTCCTTTCAAGAGAATTCAATAAGG TTAATTCCTAATTGAAATCCGAAGATAAGATTCCCACACACTTGTGGCTGATATC AAAAGGCTACTGCCTATTTAAACACATCTCTGGAGACTGAGAAAATCAGACCTCC AAGCATGAAGAAGCCTGAGCTTACTGCTACTTCTGTTGAGAAGTTCCTCATCGAG AAGTTCGATTCTGTGTCTGATCTTATGCAGCTCTCTGAGGGTGAGGAATCAAGAG CTTTCTCTTTCGATGTTGGTGGAAGAGGATACGTTCTCAGAGTTAACTCTTGCGCT GACGGATTCTACAAGGATAGATACGTGTACAGACACTTCGCTTCAGCTGCTCTCC CTATCCCTGAAGTTCTTGATATCGGAGAGTTCTCTGAGTCTCTTACCTACTGTATC TCAAGAAGGGCTCAGGGTGTTACTCTTCAAGATCTTCCTGAGACTGAGCTTCCTG CTGTTCTTCAACCTGTTGCTGAGGCTATGGATGCTATCGCTGCTGCTGATCTTTCT CAAACTTCTGGATTCGGACCTTTCGGTCCTCAGGGAATCGGACAGTACACTACTT GGAGAGATTTCATCTGCGCTATCGCTGATCCTCATGTTTACCATTGGCAGACCGTT ATGGATGATACCGTTTCTGCTTCTGTTGCTCAAGCTCTTGATGAGCTTATGCTTTG GGCTGAGGATTGTCCTGAGGTTAGACATCTTGTTCACGCTGATTTCGGATCTAAC AACGTTCTCACCGATAACGGAAGAATCACCGCTGTTATCGATTGGTCTGAGGCTA TGTTCGGAGATTCTCAATACGAGGTGGCCAACATATTCTTTTGGAGGCCTTGGCTT GCTTGTATGGAACAACAGACTAGATACTTCGAGAGAAGGCATCCTGAGCTTGCTG GATCTCCTAGACTTAGAGCTTACATGCTTAGGATCGGACTTGATCAGCTTTACCA GTCTCTCGTTGATGGAAACTTCGATGATGCTGCTTGGGCTCAGGGAAGATGTGAT GCTATCGTTAGATCTGGTGCTGGAACTGTTGGAAGAACTCAAATCGCTAGAAGAT CTGCTGCTGTTTGGACTGATGGATGTGTTGAAGTTCTCGCTGATTCTGGAAACAG AAGGCCTTCTACTAGACCTAGAGCCAAGAAGTGAAGATCGGCGGCAATAGCTTC TTAGCGCCATCCCGGGTTGATCCTATCTGTGTTGAAATAGTTGCGGTGGGCAAGG CTCTCTTTCAGAAAGACAGGCGGCCAAAGGAACCCAAGGTGAGGTGGGCTATGG CTCTCAGTTCCTTGTGGAAGCGCTTGGTCTAAGGTGCAGAGGTGTTAGCGGGATG AAGCAAAAGTGTCCGATTGTAACAAGATATGTTGATCCTACGTAAGGATATTAAA GTATGTATTCATCACTAATATAATCAGTGTATTCCAATATGTACTACGATTTCCAA TGTCTTTATTGTCGCCGTATGTAATCGGCGTCACAAAATAATCCCCGGTGACTTTC TTTTAATCCAGGATGAAATAATATGTTATTATAATTTTTGCGATTTGGTCCGTTAT AGGAATTGAAGTGTGCTTGCGGTCGCCACCACTCCCATTTCATAATTTTACATGTA TTTGAAAAATAAAAATTTATGGTATTCAATTTAAACACGTATACTTGTAAAGAAT GATATCTTGAAAGAAATATAGTTTAAATATTTATTGATAAAATAACAAGTCAGGT ATTATAGTCCAAGCAAAAACATAAATTTATTGATGCAAGTTTAAATTCAGAAATA TTTCAATAACTGATTATATCAGCTGGTACATTGCCGTAGATGAAAGACTGAGTGC 
GATATTATGGTGTAATACATAGCGGCCGCGCCCAAGGAACCCTTTTCTGGGCCAT CTTCGTACTCGGCCACGACTGGTAATTTAAT

[0577] SEQ ID NO:353

gcccaaggaacccttttctgggccatcttcgtactcggccacgactggtaatttaatggatccactagtaacggccgccagtgtgctgga attcgcccttcgtcgacctgcaggtcaacggatcaggatattcttgttt^

gaccggataagttcccttcttcatagcgaacttattcaaagaatgtttt^

attggactgaacacgagtgttaaatatggaccaggccccaaataagatccattgatatatgaattaaataacaagaataaatcgagtcac caaaccacttgccttttttaacgagacttgttcaccaacttga

aacactaaaaaattaaaagaaatggataatttcacaatatgttatacgataaagaagttacttttccaagaaattcactgatt ^ cttgcattagataaatggcaaaaaaaaacaaaaaggaaaagaaataaagcacgaagaattctagaaaatacgaaatacgcttcaatgc agtgggacccacggttcaattattgccaattttcagctccacc^

gatcgttaaatctcaacggctggatcttatgacgaccgttagaaattgtggttgtcgacgagtcagtaataaacggcgtcaaagtggttgc agccggcacacacgagtcgtgtttatcaactcaaagcacaaatacttttcctcaacctaaaaataaggcaattagccaaaaacaacltt^ gtgtaaacaacgctcaatacacgtgtcattttattattagctattgcttcaccgcctogctttctcgtgacctagtcgtccto cttcttctataaaacaatacccaaagagctcttctte^

tgatcaaggtaaatttctgtgttcctottctctcaaa^^

atcttagatcgaagacgattttctgggtttgatcgttagatatcatctlaattctcgattagggtttcatagatatca

ttgagttttgfcgaataattactcttcgatttgtga

tttctgattaacagatgagaggatctggatctgagtctgatgagtctggacttcctgctatggaaatcgagtgtagaatcactggaaccctt aacggtgttgagttcgagcttgttggaggtggtgagggaactcctgagcagggaagaatgactaacaagatgaagtctaccaagggtg ctcteccttctctccateccttctttctcacgtta^

catgctatcaacaacggtggatacaccaacaclaggatcgagaagtacgaggatggtggtgttcttcacgt agcttctcttacagatac gaggctggaagagtgatcggagatttcaaggttatgggaactggattccctgaggattctgttatcttcaccgacaagatcatcaggtcta acgctactgttgagcatcttcatcctatgggagataacgatctcgatggatctttcaccagaaccttctcacttagagatggtggttactact cttctgtggtggattctcacatgcacttcaagtctgclatccacccttctatccttcaaaacggtggacctatgttcgctttc ggaagatcactctaacaccgagcttggaatcgttgagtaccaacatgctttcaagacccctgatgctgatgctggtgaggaatgataata tcaaaatctatttagaaatacacaatattttgttgcaggctt^^ aattattttaggattggtattaaggacgcttaaatotttgtcgggtcactacgcatcattgtgattgagaagatcagcgata^ tagtactatcgataatttatttgaaaattcataagaaaagcaaacgttacatgaattgatgaaacaatacaaagacagataaagccacgca catttaggatattggccgagattactgaatattgagtaagatcacggaatttctgacaggagcatgtcttcaattcagcccaaatggcagtt gaaatactcaaaccgccccatatgcaggagcggatcattcattgtttgtttggttgcctttgccaacatgggagtccaaggttgcggccgc gcccaaggaacccttttctgggccatcttcgtactcggccacgactggtaatttaat

[0578] SEQ ID NO:354

GCCCAAGGAACCCTTTTCTGGGCCATCTTCGTACTCGGCCACGACTGGTAATTTA ATGGATCCAACCGACAACCACTTTGCGGACTTCCTTTCAAGAGAATTCAATAAGG TTAATTCCTAATTGAAATCCGAAGATAAGATTCCCACACACTTGTGGCTGATATC AAAAGGCTACTGCCTATTTAAACACATCTCTGGAGACTGAGAAAATCAGACCTCC AAGCATGAAGAAGCCTGAGCTTACTGCTACTTCTGTTGAGAAGTTCCTCATCGAG AAGTTCGATTCTGTGTCTGATCTTATGCAGCTCTCTGAGGGTGAGGAATCAAGAG CTTTCTCTTTCGATGTTGGTGGAAGAGGATACGTTCTCAGAGTTAACTCTTGCGCT GACGGATTCTACAAGGATAGATACGTGTACAGACACTTCGCTTCAGCTGCTCTCC CTATCCCTGAAGTTCTTGATATCGGAGAGTTCTCTGAGTCTCTTACCTACTGTATC TCAAGAAGGGCTCAGGGTGTTACTCTTCAAGATCTTCCTGAGACTGAGCTTCCTG CTGTTCTTCAACCTGTTGCTGAGGCTATGGATGCTATCGCTGCTGCTGATCTTTCT CAAACTTCTGGATTCGGACCTTTCGGTCCTCAGGGAATCGGACAGTACACTACTT GGAGAGATTTCATCTGCGCTATCGCTGATCCTCATGTTTACCATTGGCAGACCGTT ATGGATGATACCGTTTCTGCTTCTGTTGCTCAAGCTCTTGATGAGCTTATGCTTTG GGCTGAGGATTGTCCTGAGGTTAGACATCTTGTTCACGCTGATTTCGGATCTAAC AACGTTCTCACCGATAACGGAAGAATCACCGCTGTTATCGATTGGTCTGAGGCTA TGTTCGGAGATTCTCAATACGAGGTGGCCAACATATTCTTTTGGAGGCCTTGGCTT GCTTGTATGGAACAACAGACTAGATACTTCGAGAGAAGGCATCCTGAGCTTGCTG GATCTCCTAGACTTAGAGCTTACATGCTTAGGATCGGACTTGATCAGCTTTACCA GTCTCTCGTTGATGGAAACTTCGATGATGCTGCTTGGGCTCAGGGAAGATGTGAT GCTATCGTTAGATCTGGTGCTGGAACTGTTGGAAGAACTCAAATCGCTAGAAGAT CTGCTGCTGTTTGGACTGATGGATGTGTTGAAGTTCTCGCTGATTCTGGAAACAG AAGGCCTTCTACTAGACCTAGAGCCAAGAAGTGAAGATCGGCGGC AATAGCTTC TTAGCGCCATCCCGGGTTGATCCTATCTGTGTTGAAATAGTTGCGGTGGGCAAGG CTCTCTTTCAGAAAGACAGGCGGCCAAAGGAACCCAAGGTGAGGTGGGCTATGG CTCTCAGTTCCTTGTGGAAGCGCTTGGTCTAAGGTGCAGAGGTGTTAGCGGGATG AAGCAAAAGTGTCCGATTGTAACAAGATATGTTGATCCTACGTAAGGATATTAAA GTATGTATTCATCACTAATATAATCAGTGTATTCCAATATGTACTACGATTTCCAA TGTCTTTATTGTCGCCGTATGTAATCGGCGTCACAAAATAATCCCCGGTGACTTTC TTTTAATCCAGGATGAAATAATATGTTATTATAATTTTTGCGATTTGGTCCGTTAT AGGAATTGAAGTGTGCTTGCGGTCGCCACCACTCCCATTTCATAATTTTACATGTA TTTGAAAAATAAAAATTTATGGTATTCAATTTAAACACGTATACTTGTAAAGAAT GATATCTTGAAAGAAATATAGTTTAAATATTTATTGATAAAATAACAAGTCAGGT ATTATAGTCCAAGCAAAAACATAAATTTATTGATGCAAGTTTAAATTCAGAAATA TTTCAATAACTGATTATATCAGCTGGTACATTGCCGTAGATGAAAGACTGAGTGC 
GATATTATGGTGTAATACATAGCGGCCGCAGCGAGAGAAAGCTTATTGCAACTTC AACTACTTGCTGGTCGATCGTGTTGGCCACTC

[0579] SEQ ID NO:355

GCCCAAGGAACCCTTTTCTGGGCCATCTTCGTACTCGGCCACGACTGGTAATTTA ATGGATCCACTAGTAACGGCCGCCAGTGTGCTGGAATTCGCCCTTCGTCGACCTG CAGGTCAACGGATCAGGATATTCTTGTTTAAGATGTTGAACTCTATGGAGGTTTG TATGAACTGATGATCTAGGACCGGATAAGTTCCCTTCTTCATAGCGAACTTATTC AAAGAATGTTTTGTGTATCATTCTTGTTACATTGTTATTAATGAAAAAATATTATT GGTCATTGGACTGAACACGAGTGTTAAATATGGACCAGGCCCCAAATAAGATCC ATTGATATATGAATTAAATAACAAGAATAAATCGAGTCACCAAACCACTTGCCTT TTTTAACGAGACTTGTTCACCAACTTGATACAAAAGTCATTATCCTATGCAAATC AATAATCATACAAAAATATCCAATAACACTAAAAAATTAAAAGAAATGGATAAT TTCACAATATGTTATACGATAAAGAAGTTACTTTTCCAAGAAATTCACTGATTTTA TAAGCCCACTTGCATTAGATAAATGGCAAAAAAAAACAAAAAGGAAAAGAAATA AAGCACGAAGAATTCTAGAAAATACGAAATACGCTTCAATGCAGTGGGACCCAC GGTTCAATTATTGCCAATTTTCAGCTCCACCGTATATTTAAAAAATAAAACGATA ATGCTAAAAAAATATAAATCGTAACGATCGTTAAATCTCAACGGCTGGATCTTAT GACGACCGTTAGAAATTGTGGTTGTCGACGAGTCAGTAATAAACGGCGTCAAAG TGGTTGCAGCCGGCACACACGAGTCGTGTTTATCAACTCAAAGCACAAATACTTT TCCTCAACCTAAAAATAAGGCAATTAGCCAAAAACAACTTTGCGTGTAAACAAC GCTCAATACACGTGTCATTTTATTATTAGCTATTGCTTCACCGCCTTAGCTTTCTC GTGACCTAGTCGTCCTCGTCTTTTCTTCTTCTTCTTCTATAAAACAATACCCAAAG AGCTCTTCTTCTTCACAATTCAGATTTCAATTTCTCAAAATCTTAAAAACTTTCTCT CAATTCTCTCTACCGTGATCAAGGTAAATTTCTGTGTTCCTTATTCTCTCAAAATC TTCGATTTTGTTTTCGTTCGATCCCAATTTCGTATATGTTCTTTGGTTTAGATTCTG TTAATCTTAGATCGAAGACGATTTTCTGGGTTTGATCGTTAGATATCATCTTAATT CTCGATTAGGGTTTCATAGATATCATCCGATTTGTTCAAATAATTTGAGTTTTGTC GAATAATTACTCTTCGATTTGTGATTTCTATCTAGATCTGGTGTTAGTTTCTAGTTT GTGCGATCGAATTTGTCGATTAATCTGAGTTTTTCTGATTAACAGATGAGAGGAT CTGGATCTGAGTCTGATGAGTCTGGACTTCCTGCTATGGAAATCGAGTGTAGAAT CACTGGAACCCTTAACGGTGTTGAGTTCGAGCTTGTTGGAGGTGGTGAGGGAACT CCTGAGCAGGGAAGAATGACTAACAAGATGAAGTCTACCAAGGGTGCTCTTACC TTCTCTCCATACCTTCTTTCTCACGTTATGGGATACGGATTCTACCACTTCGGAAC TTACCCATCTGGATACGAGAACCCTTTCCTTCATGCTATCAACAACGGTGGATAC ACCAACACTAGGATCGAGAAGTACGAGGATGGTGGTGTTCTTCACGTTAGCTTCT CTTACAGATACGAGGCTGGAAGAGTGATCGGAGATTTCAAGGTTATGGGAACTG GATTCCCTGAGGATTCTGTTATCTTCACCGACAAGATCATCAGGTCTAACGCTACT GTTGAGCATCTTCATCCTATGGGAGATAACGATCTCGATGGATCTTTCACCAGAA 
CCTTCTCACTTAGAGATGGTGGTTACTACTCTTCTGTGGTGGATTCTCACATGCAC TTCAAGTCTGCTATCCACCCTTCTATCCTTCAAAACGGTGGACCTATGTTCGCTTT CAGAAGAGTTGAGGAAGATCACTCTAACACCGAGCTTGGAATCGTTGAGTACCA ACATGCTTTCAAGACCCCTGATGCTGATGCTGGTGAGGAATGATAATATCAAAAT CTATTTAGAAATACACAATATTTTGTTGCAGGCTTGCTGGAGAATCGATCTGCTAT CATAAAAATTACAAAAAAATTTTATTTGCCTCAATTATTTTAGGATTGGTATTAAG GACGCTTAAATTATTTGTCGGGTCACTACGCATCATTGTGATTGAGAAGATCAGC GATACGAAATATTCGTAGTACTATCGATAATTTATTTGAAAATTCATAAGAAAAG CAAACGTTACATGAATTGATGAAACAATACAAAGACAGATAAAGCCACGCACAT TTAGGATATTGGCCGAGATTACTGAATATTGAGTAAGATCACGGAATTTCTGACA GGAGCATGTCTTCAATTCAGCCCAAATGGCAGTTGAAATACTCAAACCGCCCCAT ATGCAGGAGCGGATCATTCATTGTTTGTTTGGTTGCCTTTGCCAACATGGGAGTCC AAGGTTGCGGCCGCAGCGAGAGAAAGCTTATTGCAACTTCAACTACTTGCTGGTC GATCGTGTTGGCCACTC

[0580] SEQ ID NO:375

TTCTGGCCTCTTTATTGGGCCGCCCAAGGAACCCTTTTCTGGGCCATCTTCGTACT CGGCCACGACTGGTAATTTAATGGATCCACTAGTAA

[0581] SEQ ID NO:376

[0582] SEQ ID NO:377

TTCTGGCCTCTTTATTGGGCCGCCCAAGGAACCCTTTTCTGGGCCATC:CAGTCGT GGCCGAGTACGAAGATGGCCCAGA: : :TACTCGGCCACGACTGGTAATTTAATGGA TCCACTAGTAA

[0583] SEQ ID NO:378

TTCTGGCCTCTTTATTGGGCCGCCCAAGGAACCCTTTTCTGGGCCATC : : : GTACTCG GCCACGACTGGTAATTTAATGGATCCACTAGTAA

[0584] SEQ ID NO:379

[0585] TTCTGGCCTCTTTATTGGGCCGCCCAAGGAACCCTTTTCTAGG: : : : : :T

ATCTCAGTTCGGTGTAGGTCGTTCGCTCCAAGCTGGGCTGCGTGCACGAAC : CGTA CTCGGCCACGACTGGTAATTTAATGGATCCACTAGTAA

[0586] SEQ ID NO:380

TTCTGGCCTCTTTATTGGGCCGCCCAAGGAACCCTTTTCTGGGCCA: :::::::::::::: : :GAC TGGTAATTTAATGGATCCACTAGTAA

[0587] SEQ ID NO:381

TCCAAGGTTGCGGCCGCGCCCAAGGAACCCTTTTCTGGGCCATCTTCGTACTCGG CCACGACTGGTAATTTAATTTTCAATTTATTT

[0588] SEQ ID NO:382

TCCAAGGTTGCGGCCGCGCCCAAGGAACCCTTTTCTGGGCCATiT: : : TACTCGGCC A CGACTGGTAATTTAATTTTCAATTTATTT

[0589] SEQ ID NO:383

rCGTACTCGGCCACGACTGGTAATTTAATTTTCAATTTATTT

[0590] SEQ ID NO:384

TCCAAGGTTGCGGCCGCGCCCAAGGAACCCTTTTCTGGGCCATCTTCGTACT CGGCCACGACTGGTAATTTAATTTTCAATTTATTT

[0591] SEQ ID NO:385

TCCAAGGTTGCGGCCGCGCCCAAGGAACCCTTTTCTGGGCCATCTTC GTAATTTAATTTTCAATTTATTTTT

[0592] SEQ ID NO:386

TCCAAGGTTGCGGCCGCGCCCAAGGAACCCTTTTCTGG: : : : : :TAGCGGTGGTT

TTTTTGTTTGCAAGCAGCAGATTACGCGCAGAAAAAAAGGA

TCGTACTCGGCCACGACTGGTAATTTAATTTTCAATTTATTT

[0593] SEQ ID NO:387

TCCAAGGTTGCGGCCGCGCCCAAGGAACCCTTTTCTGGGCCATCTTACGAGC GTAATGGCTGGCCTGTTGAACAAGTCTGGAAAGAAATGCATAAACATATCCC AGCCACGACT: : :::::::::::::::: GGTAATTTAATTTTC AATTTATTT

[0594] SEQ ID NO:388

TAGTTTATTTGCCCCAAGCGAGAGAAAGCTTATTGCAACTTCAACTCGTACTCGG

CCACGACTGGTAATTTAATGGATCCACTAGTAA

[0595] SEQ ID NO:389

TAGTTTATTTGCCCCAAGCGAGAGAAAGCTTATTGCAACTTCAACT

[0596] SEQ ID NO:390

TAGTTTATTTGCCCCAAGCGAGAGAAAGCTTATTGCAACTTCAACG

[0597] SEQ ID NO:391

TAGTTTATTTGCCCCAAGCGAGAGAAAGCTTATTGCAACTTCAACTTCGTACTCG GCCACGACTGGTAATTTAATGGATCCACTAGTAA

[0598] SEQ ID NO:392

TAGTTTATTTGCCCCAAGCGAGAGAAAGCTTATTGCAACTTCAACTAT::GTACTCG GCCACGACTGGTAATTTAATGGATCCACTAGTAA

[0599] SEQ ID NO:393

TAGTTTATTTGCCCCAAGCGAGAGAAAGCTTATTGCAACTTCA: : : : : TACTCGGCCA CGACTGGTAATTTAATGGATCCACTAGTAA

[0600] SEQ ID NO:394

AGGTAATTTAATGGATCCACTAGTAA

[0601] SEQ ID NO:395

TCCAAGGTTGCGGCCGCAGCGAGAGAAAGCTTATTGCAACTTCAACTACTTGCTG GTCGATCGTGTTG GCCACTCTTGTTTATCTATCA

[0602] SEQ ID NO:396

TCCAAGGTTGCGGCCGCAGCGAGAGAAAGCTTATTGCAACTTCA: : ACTTGCTGGT CGATCGTGTTGGCCACTCTTGTTTATCTATCA

[0603] SEQ ID NO:397

TCCAAGGTTGCGGCCGC ::::::::::::::::::::::::::::: GCGCCGACCCAGCTTTCTTGTACAAA GTTGGCATTATAAGAAAGCATTGCTTATCAATTTGTTGCAACGAACAGGTCACTA TCAGTCAAAACTTGCTGGTCGATCGTGTTGGCCACTCTTGTTTATCTATCA

[0604] SEQ ID NO:398

TCCAAGGTTTGCGGCCGCAGCGAGAGAAAGCTTATTGCAA: : : : : : CTTC ACTTGCTG GTCGATCGTGTTGGCCACTCTTGTTTATCTATCA

[0605] SEQ ID NO:399

TCCAAGGTTGCGGCCGC AGCGAGAGAAAGCTTATTGCAACTTCA: : GATAAAAGTT GCTCGCCTGTGTGGGTGTGGATGCT ACTTGCTGGTCGATCGTGTTGGCCACTCTT GTTTATCTATCA

[0606] SEQ ID NO:400

TCCAAGGTTGCGGCCGCAGCGAGAGAAAGCTTATTGCAACTTCAACTACACTACT TGCTGGTCGATCGTGTTGGCCACTCTTGTTTATCTATCA

[0607] SEQ ID NO:401

TCCAAGGTTGCGGCCGCAGCGAGAGAAAGCTTATTGCAACTTCAACTACTTGCTG GTCGATCGTGTTGGCCACTCTTGTTTATCTATCA

[0608] SEQ ID NO:402

CTTACATGCTTAGGATCGGACTTG

[0609] SEQ ID NO:403

AGTTCCAGCACCAGATCTAACG

[0610] SEQ ID NO:404

CCCTGAGCCCAAGCAGCATCATCG

[0611] SEQ ID NO:405

CGGAGAGGGCGTGGAAGG

[0612] SEQ ID NO:406

TTCGATTTGCTACAGCGTCAAC

[0613] SEQ ID NO:407

AGGCACCATCGCAGGCTTCGCT

[0614] SEQ ID NO:408

TTCTGGCCTCTTTATTGGGCCGCCCAAGGAACCCTTTTCTGGGCCATCTTCGTACT CGGCCACGACTGGTAATTTAATGGATCCAACCGACAACCACTT

[0615] SEQ ID NO:409

TTCTGGCCTCTTTATTGGGCCGCCCAAGGAACCCTTT: :::::::::::: : : TACTCGGCCACG ACTGGTAATTTAATGGATCCAACCGACAACCACTT

[0616] SEQ ID NO:410

TTCTGGCCTCTTTATTGGGCCGCCCAAGGAACCCTTTTCTGG: : : : : : : TCGTACTCGGC CACGACTGGTAATTTAATGGATCCAACCGACAACCACTT

[0617] SEQ ID NO:411

TTCTGGCCTCTTTATTGGGCCGCCCAAGGAACCCTTTTCTGG

[0618] SEQ ID NO:412

TTCTGGCCTCTTTATTGGGCCGCCCAAGGAACCCTTTTCTGGGCCATCT: : : : : : :CGGC CACGACTGGTAATTTAATGGATCCAACCGACAACCACTT

[0619] SEQ ID NO:413

TTCTGGCCTCTTTATTGGGCCGCCCAAGGAACCCTTTTCTGGGCCAT::TCGTACTC GGCCACGACTGGTAATTTAATGGATCCAACCGACAACCACTT

[0620] SEQ ID NO:414

TTCTGGCCTCTTTATTGGGCCGCCCAAGGAACCCTTTTCTGGGCCATCT

[0621] SEQ ID NO:415

GTAATACATAGCGGCCGCGCCCAAGGAACCCTTTTCTGGGCCATCTTCGTACTCG GCCACGACTGGTAATTTAATTTTCAATTTATTTTTTCTTCAACTTCTTA

[0622] SEQ ID NO:416

GTAATACATAGCGGCCGCGCCCAAGGAACCCTTTTCTGGGCCAT: : ::::::: : :GCCACG ACTGGTAATTTAATTTTCAATTTATTTTTTCTTCAACTTCTTA

[0623] SEQ ID NO:417

GTAATACATAGCGGCCGCGCCCAAGGAACCCTTTTCTGGGCCATCT: : :::::::::: : :GAC TGGTAATTTAATTTTCAATTTATTTTTTCTTCAACTTCTTA

[0624] SEQ ID NO:418

GTAATACATAGCGGCCGCGCCCAAGGAACCCTTTTCTGGGCCATCT: :::::::::::: :GAC TGGTAATTTAATTTTCAATTTATTTTTTCTTCAACTTCTTA

[0625] SEQ ID NO:419

GTAATACATAGCGGCCGCGCCCAAGGAACCCTTTTCTGGGCCAT::::GTACTCGGC CACGACTGGTAATTTAATTTTCAATTTATTTTTTCTTCAACTTCTTA

[0626] SEQ ID NO:420

: rGTACTCGGCCACGACTGGTAATTTAATTTT: : ::::::: : : TCTTTC AACTTCTTA

[0627] SEQ ID NO:421

GTAATACATAGCGGCCGCGCCCAA: :::::::::::::::::::::: : : TACTCGGCCACGACTGGTAA TTTAATTTTCAATTTATTTTTTCTTCAACTTCTTA

[0628] SEQ ID NO:422

TGTAATACATAGCGGCCGCGCCC AAGGAACCCTTTACTCGGCCA: ::::::::::::::::::::: TA ATTTAATTTTCAATTTATTTTTTCTTCAACTTCTTA

[0629] SEQ ID NO:423

tnantgattc ccaatggcgg cgctttcaag gacatgatca tggagggtga tggcaggacctcgtactgaa atggtccgaa ggtccacgcc gccaactacg ag

[0630] SEQ ID NO:424

cnantactcg tagttggcgg cgtggacctt cggaccattt cagtacgagg tcctgccatcaccctccatg atcatgtcct tgaaagcgcc gccattggga at

[0631] SEQ ID NO:425

tnantgattc ccaatggcgg cgctttcaag gacatgatca tggagggtga tggcaggacc tcgtactgaa atttgcaggt acaag

[0632] SEQ ID NO:426

angngtcttg tacctgcaaa tttcagtacg aggtcctgcc atcaccctcc atgatcatgtccttgaaagc gccgccattg ggaat

[0633] SEQ ID NO:427

GTTTACCCGCCAATATATCCTGTCAAACACTGATAGTTTAAACTGAAGGCGGGAA ACGACAATCTGATCATGAGCGGAGAATTAAGGGAGTCACGTTATGACCCCCGCC GATGACGCGGGACAAGCCGTTTTACGTTTGGAACTGACAGAACCGCAACGTTGA AGGAGCCACTCAGCAAGCTTACTAGTAGCGCTGTTTAAACGCTCTTCAACTGGAA GAGCGGTTACCCGGACCGAAGCTTGCATGCCTGCAGTGCAGCGTGACCCGGTCGT GCCCCTCTCTAGAGATAATGAGCATTGCATGTCTAAGTTATAAAAAATTACCACA TATTTTTTTTGTCACACTTGTTTGAAGTGCAGTTTATCTATCTTTATACATATATTT AAACTTTACTCTACGAATAATATAATCTATAGTACTACAATAATATCAGTGTTTTA GAGAATCATATAAATGAACAGTTAGACATGGTCTAAAGGACAATTGAGTATTTTG ACAACAGGACTCTACAGTTTTATCTTTTTAGTGTGCATGTGTTCTCCTTTTTTTTTG CAAATAGCTTCACCTATATAATACTTCATCCATTTTATTAGTACATCCATTTAGGG TTTAGGGTTAATGGTTTTTATAGACTAATTTTTTTAGTACATCTATTTTATTCTATT TTAGCCTCTAAATTAAGAAAACTAAAACTCTATTTTAGTTTTTTTATTTAATAATT TAGATATAAAATAGAATAAAATAAAGTGACTAAAAATTAAACAAATACCCTTTA AGAAATTAAAAAAACTAAGGAAACATTTTTCTTGTTTCGAGTAGATAATGCCAGC CTGTTAAACGCCGTCGACGAGTCTAACGGACACCAACCAGCGAACCAGCAGCGT CGCGTCGGGCCAAGCGAAGCAGACGGCACGGCATCTCTGTCGCTGCCTCTGGACC CCTCTCGAGAGTTCCGCTCCACCGTTGGACTTGCTCCGCTGTCGGCATCCAGAAA TTGCGTGGCGGAGCGGCAGACGTGAGCCGGCACGGCAGGCGGCCTCCTCCTCCTC TCACGGCACCGGCAGCTACGGGGGATTCCTTTCCCACCGCTCCTTCGCTTTCCCTT CCTCGCCCGCCGTAATAAATAGACACCCCCTCCACACCCTCTTTCCCCAACCTCGT GTTGTTCGGAGCGCACACACACACAACCAGATCTCCCCCAAATCCACCCGTCGGC ACCTCCGCTTCAAGGTACGCCGCTCGTCCTCCCCCCCCCCCCCTCTCTACCTTCTC TAGATCGGCGTTCCGGTCCATGGTTAGGGCCCGGTAGTTCTACTTCTGTTCATGTT TGTGTTAGATCCGTGTTTGTGTTAGATCCGTGCTGCTAGCGTTCGTACACGGATGC GACCTGTACGTCAGACACGTTCTGATTGCTAACTTGCCAGTGTTTCTCTTTGGGGA ATCCTGGGATGGCTCTAGCCGTTCCGCAGACGGGATCGATTTCATGATTTTTTTTG TTTCGTTGCATAGGGTTTGGTTTGCCCTTTTCCTTTATTTCAATATATGCCGTGCAC TTGTTTGTCGGGTCATCTTTTCATGCTTTTTTTTGTCTTGGTTGTGATGATGTGGTC TGGTTGGGCGGTCGTTCTAGATCGGAGTAGAATTCTGTTTCAAACTACCTGGTGG ATTTATTAATTTTGGATCTGTATGTGTGTGCCATACATATTCATAGTTACGAATTG AAGATGATGGATGGAAATATCGATCTAGGATAGGTATACATGTTGATGCGGGTTT TACTGATGCATATACAGAGATGCTTTTTGTTCGCTTGGTTGTGATGATGTGGTGTG GTTGGGCGGTCGTTCATTCGTTCTAGATCGGAGTAGAATACTGTTTCAAACTACCT GGTGTATTTATTAATTTTGGAACTGTATGTGTGTGTCATACATCTTCATAGTTACG 
AGTTTAAGATGGATGGAAATATCGATGTAGGATAGGTATACATGTTGATGTGGGT TTTACTGATGCATATACATGATGGCATATGCAGCATCTATTCATATGCTCTAACCT TGAGTACCTATCTATTATAATAAACAAGTATGTTTTATAATTATTTTGATCTTGAT ATACTTGGATGATGGCATATGCAGCAGCTATATGTGGATTTTTTTAGCCCTGCCTT CATACGCTATTTATTTGCTTGGTACTGTTTCTTTTGTCGATGCTCACCCTGTTGTTT GGTGTTACTTCTGCAGGTCGACTCTAGAGGATCCACACGACACCATGTCCGCCCG CGAGGTGCACATCGACGTGAACAACAAGACCGGCCACACCCTCCAGCTGGAGGA CAAGACCAAGCTCGACGGCGGCAGGTGGCGCACCTCCCCGACCAACGTGGCCAA CGACCAGATCAAGACCTTCGTGGCCGAATCCAACGGCTTCATGACCGGCACCGA GGGCACCATCTACTACTCAATTAATGGCGAGGCCGAGATCAGCCTCTACTTCGAC AACCCGTTCGCCGGCTCCAACAAATACGACGGCCACTCCAACAAGTCCCAGTACG AGATCATCACCCAGGGCGGCTCCGGCAACCAGTCCCACGTGACCTACACCATCCA GACCACCTCCTCCCGCTACGGCCACAAGTCCTGAGTCATGAGTCATGAGTCAGTT AACCTAGACTTGTCCATCTTCTGGATTGGCCAACTTAATTAATGTATGAAATAAA AGGATGCACACATAGTGACATGCTAATCACTATAATGTGGGCATCAAAGTTGTGT GTTATGTGTAATTACTAGTTATCTGAATAAAAGAGAAAGAGATCATCCATATTTC TTATCCTAAATGAATGTCACGTGTCTTTATAATTCTTTGATGAACCAGATGCATTT CATTAACCAAATCCATATACATATAAATATTAATCATATATAATTAATATCAATT GGGTTAGCAAAACAAATCTAGTCTAGGTGTGTTTTGCGAATGCGGCCGCGGACCG AATTGGGGATCTGCATGAAAGAAACTGTCGCACTGCTGAACCGCACCTTGTCACT TTC ATCGAACACGACCTGTGCCCAAGATGACGGTGCTGCGGTCTAAGTGAGGCTG AATTGCCTTGGACAGAAGCGGACTCCCTACAATTAGTTAGGCCAAACGGTGCATC CATGTGTAGCTCCGGGCTCGGGCTGTATCGCCATCTGCAATAGCATCCATGGAGC TCGTTCCATGTAGTTGGAGATGAACCAATGATCGGGCGTGTGGACGTATGTTCCT GTGTACTCCGATAGTAGAGTACGTGTTAGCTCTTTCATGGTGCAAGTGAAATTTG TGTTGGTTTAATTACCCCTACGTTAGTTGCGGGACAGGAGACACATCATGAATTT AAAGGCGATGATGTCCTCTCCTGTAATGTTATTCTTTTGATGTGATGAATCAAAAT GTCATATAAAACATTTGTTGCTCTTTAGTTAGGCCTGATCGTAGAACGAAATGCT CGTGTAGCGGGGCTACGAGCCTATGACGCAATAACACTGGTTTGCCGGCCCGGA GTCGCTTGACAAAAAAAAGCATGTTAAGTTTATTTACAATTCAAAACCTAACATA TTATATTCCCTCAAAGCAGGTTCACGATCACACCTGTACCTAAAAAAAACATGAA GAATATATTACTCCATTATTATGAGATGAACCACTTGGCAAGAGTGGTAAGCTAT ATAAAAAAATGAACATTATTACGAGATGTTATATGCCATTATATTGATTCGAAGA TATATGTTTCTTTCTCCCACGGGCACCTAACGGATACATGATAAGGCCAAGGCAG ATCACGGGAAATTATTCGAATACATGTTACGCCCTATTGCCGGAAAAAAAATGCA 
GGGCAGGTGTTGGCCGTAGCGATTTAAGCACTTAAGCTGGAGGTTGCCACACTTG GATGCAAGCGTCTGACCCTTCTAAAAAATCGGCGGCTTTGTCCGTATCCGTATCC CCTATCCAACATCTAGCTGGCCACACGACGGGGCTGGGCAGATCGTGGATGCCG GGTCGACGTCGATCGTCAGCCATCATAGACCAATCGACCATCTGTTATGGATGCT TGCTAGCTAGACTAGTCAGACATAAAATTTGGATACTTTCTCCCAACTGGGAGAC GGGGACTGATGTGCAGCTGCACGTGAGCTAAATTTTTCCCTATAAATATGCATGA AATACTGCATTATCTTGCCACAGCCACTGCCACAGCCAGATAACAAGTGCAGCTG GTAGCACGCAACGCATAGCTCTGGACTTGTAGCTAGGTAGCCAACCGGATCCACA CGACACCATGCTCGACACCAACAAGGTGTACGAGATCAGCAACCACGCCAACGG CCTCTACGCCGCCACCTACCTCTCCCTCGACGACTCCGGCGTGTCCCTCATGAACA AGAACGACGACGACATCGACGACTACAACCTCAAGTGGTTCCTCTTCCCGATCGA CGACGACCAGTACATCATCACCTCCTACGCCGCCAACAACTGCAAGGTGTGGAAC GTGAACAACGACAAGATTAATGTGTCAACCTACTCCTCCACCAACTCCATCCAGA AGTGGCAGATCAAGGCCAACGGCTCCTCCTACGTGATCCAGTCCGACAACGGCA AGGTGCTCACCGCCGGCACCGGCCAGGCCCTCGGCCTCATCCGCCTCACCGACGA GTCCTCCAACAACCCGAACCAGCAATGGAACCTGACGTCCGTGCAGACCATCCA GCTCCCGCAGAAGCCGATCATCGACACCAAGCTCAAGGACTACCCGAAGTACTC CCCGACCGGCAACATCGACAACGGCACCTCCCCGCAGCTCATGGGCTGGACCCTC GTGCCGTGCATCATGGTGAACGACCCGAACATCGACAAGAACACCCAGATCAAG ACCACCCCGTACTACATCCTCAAGAAGTACCAGTACTGGCAGAGGGCCGTGGGCT CCAACGTCGCGCTCCGCCCGCACGAGAAGAAGTCCTACACCTACGAGTGGGGCA CCGAGATCGACCAGAAGACCACCATCATCAACACCCTCGGCTTCCAGATCAACAT CGACAGCGGCATGAAGTTCGACATCCCGGAGGTGGGCGGCGGTACCGACGAGAT CAAGACCCAGCTCAACGAGGAGCTCAAGATCGAGTATTCACATGAGACGAAGAT CATGGAGAAGTACCAGGAGCAGTCCGAGATCGACAACCCGACCGACCAGTCCAT GAACTCCATCGGCTTCCTCACCATCACCTCCCTGGAGCTCTACCGCTACAACGGC TCCGAGATCCGCATCATGCAGATCCAGACCTCCGACAACGACACCTACAACGTGA CCTCCTACCCGAACCACCAGCAGGCCCTGCTGCTGCTGACCAACCACTCCTACGA GGAGGTGGAGGAGATCACCAACATCCCGAAGTCCACCCTCAAGAAGCTCAAGAA GTACTACTTCTGAGTCATGAGTCATGAGTCAGTTAACCTAGACTTGTCCATCTTCT GGATTGGCCAACTTAATTAATGTATGAAATAAAAGGATGCACACATAGTGACAT GCTAATCACTATAATGTGGGCATCAAAGTTGTGTGTTATGTGTAATTACTAGTTAT CTGAATAAAAGAGAAAGAGATCATCCATATTTCTTATCCTAAATGAATGTCACGT GTCTTTATAATTCTTTGATGAACCAGATGCATTTCATTAACCAAATCCATATACAT ATAAATATTAATCATATATAATTAATATCAATTGGGTTAGCAAAACAAATCTAGT 
CTAGGTGTGTTTTGCGAATTCCCATGGAGTCAAAGATTCAAATAGAGGACCTAAC AGAACTCGCCGTAAAGACTGGCGAACAGTTCATACAGAGTCTCTTACGACTCAAT GACAAGAAGAAAATCTTCGTCAACATGGTGGAGCACGACACGCTTGTCTACTCCA AAAATATCAAAGATACAGTCTCAGAAGACCAAAGGGCAATTGAGACTTTTCAAC AAAGGGTAATATCCGGAAACCTCCTCGGATTCCATTGCCCAGCTATCTGTCACTT TATTGTGAAGATAGTGGAAAAGGAAGGTGGCTCCTACAAATGCCATCATTGCGAT AAAGGAAAGGCCATCGTTGAAGATGCCTCTGCCGACAGTGGTCCCAAAGATGGA CCCCCACCCACGAGGAGCATCGTGGAAAAAGAAGACGTTCCAACCACGTCTTCA AAGCAAGTGGATTGATGTGATATCTCCACTGACGTAAGGGATGACGCACAATCCC ACTATCCTTCGCAAGACCCTTCCTCTATATAAGGAAGTTCATTTCATTTGGAGAGG ACAGGGTACCCGGGGATCCACCATGTCTCCGGAGAGGAGACCAGTTGAGATTAG GCCAGCTACAGCAGCTGATATGGCCGCGGTTTGTGATATCGTTAACCATTACATT GAGACGTCTACAGTGAACTTTAGGACAGAGCCACAAACACCACAAGAGTGGATT GATGATCTAGAGAGGTTGCAAGATAGATACCCTTGGTTGGTTGCTGAGGTTGAGG GTGTTGTGGCTGGTATTGCTTACGCTGGGCCCTGGAAGGCTAGGAACGCTTACGA TTGGACAGTTGAGAGTACTGTTTACGTGTCACATAGGCATCAAAGGTTGGGCCTA GGATCCACATTGTACACACATTTGCTTAAGTCTATGGAGGCGCAAGGTTTTAAGT CTGTGGTTGCTGTTATAGGCCTTCCAAACGATCCATCTGTTAGGTTGCATGAGGCT TTGGGATACACAGCCCGGGGTACATTGCGCGCAGCTGGATACAAGCATGGTGGA TGGCATGATGTTGGTTTTTGGCAAAGGGATTTTGAGTTGCCAGCTCCTCCAAGGC CAGTTAGGCCAGTTACCCAGATCTGAGTCGACCTGCAGGCATGCCCGCTGAAATC ACCAGTCTCTCTCTACAAATCTATCTCTCTCTATAATAATGTGTGAGTAGTTCCCA GATAAGGGAATTAGGGTTCTTATAGGGTTTCGCTCATGTGTTGAGCATATAAGAA ACCCTTAGTATGTATTTGTATTTGTAAAATACTTCTATCAATAAAATTTCTAATTC CTAAAACCAAAATCCAGGGCGAGCTCGGTACCCGGGGATCCTCTAGAGTCGACC TGCAGGCATGCCCGCGGATATCGATGGGCCCCGGCCGAAGCTTCGGTCCGGGCC ATCGTGGCCTCTTGCTCTTCAGGATGAAGAGCTATGTTTAAACGTGCAAGCGCTC AATTCGCCCTATAGTGAGTCGTATTACAATCGTACGCAATTCAGTACATTAAAAA CGTCCGCAATGTGTTATTAAGTTGTCTAAGCGTCAATTTGTTTACACCACAATATA TCCTGCCA

[0634] SEQ ID NO:428

GAGGCCGACACGGCACACACGGCGACATTCACCGCCGGCTTCCTCCGTCGCCACT CGGCACAAGGCTCATCAGTCGCCGATGCCCGATGCGATCAACGGAAGCGGATGG CCCGCTTCTTTAGAATTGGCACAGGAACACTGGCCACTGCCCTTGATGTGCAATT ATGCCTGCGAAAGCCTAGGCAACACACGCGAATAAACGAGCGAATGACACGGAA AGCTGATGTGGTATGAATTATACAACATTATGGGCCAAAATATTATTCTATCCAC CATTGTGTAGCCACAGCATCGGTATTTGAGTTGTGCGAGGACAAATCCCTCGTGA GGTCAAAAACAGCAAATAATAAACCCATCTCCTGAAGACACCAAAAAAAAGGAG CAGCTCCTCGTGTCAATGAACAAGCGTCACAAGAAAAGGGAGCACGTAAATAAC CTCTTCAATTGCTTCAGCATGAAAAGAACGGGAAGAAATGCAAGTCTACAGAGG AAAGTGCAGCTGTTTCGGCTGCCATGGCAAGTTCCTACATGGGCGAGGAAAAGCT GAACTGGATTCCAGTCTTCGCGCTGTCATGCTCAGCTTGCTTTAGGATGCGGCAA TAGTTCACCTGGATGAAAAAGATACAAGTTAGTCTTGAAGCAGTCGAGTGGACAT CCAAAGTATCAAAATCGAAAGCTTGTAAATGGGGAAGGAAATATACCTCTACCC GGAAAAGTTTGGTAGGCAAAATAATCCCAACGCCAGCAGAGCTC

[0635] SEQ ID NO:429

CGTGCAAGCGCTCAATTCGCCCTATAGTGAGTCGTATTACAATCGTACGCAATTC AGTACATTAAAAACGTCCGCAATGTGTTATTAAGTTGTCTAAGCGTCAATTTGTTT ACACCAGAGGCCGACACGGCACACACGGCGACATTCACCGCCGGCTTCCTCCGTC GCCACTCGGCACAAGGCTCATCAGTCGCCGATGCCCGATGCGATCAACGGAAGC GGATGGCCCGCTTCTTTAGAATTGGCACAGGAACACTGGCCACTGCCCTTGATGT GCAATTATGCCTGCGAAAGCCTAGGCAACACACGCGAATAAACGAGCGAATGAC ACGGAAAGCTGATGTGGTATGAATTATACAACATTATGGGCCAAAATATTATTCT ATCCACCATTGTGTAGCCACAGCATCGGTATTTGAGTTGTGCGAGGACAAATCCC TCGTGAGGTCAAAAACAGCAAATAATAAACCCATCTCCTGAAGACACCAAAAAA AAGGAGCAGCTCCTCGTGTCAATGAACAAGCGTCACAAGAAAAGGGAGCACGTA AATAACCTCTTCAATTGCTTCAGCATGAAAAGAACGGGAAGAAATGCAAGTCTA CAGAGGAAAGTGCAGCTGTTTCGGCTGCCATGGCAAGTTCCTACATGGGCGAGG AAAAGCTGAACTGGATTCCAGTCTTCGCGCTGTCATGCTCAGCTTGCTTTAGGAT GCGGCAATAGTTCACCTGGATGAAAAAGATACAAGTTAGTCTTGAAGCAGTCGA GTGGACATCC AAAGTATCAAAATCGAAAGCTTGTAAATGGGGAAGGAAATATAC CTCTACCCGGAAAAGTTTGGTAGGCAAAATAATCCCAACGCCAGCAGAGCTC

[0636] SEQ ID NO:430

AGTTGGGAAGGCAAAACGAATATAAGTGCATTCGGATTACTGTTTAGTCGAGTCA TATTTAAGGAATTCATTGTAAATGTTCTAACCTAACCTAAGTATTAGGCAGCTAT GGCTGATATGGATCTGATTGGACTTGATTTATCCATGATAAGTTTAAGAGCAACT CAAAGAGGTTAGGTATATATGGTTTTGTAAAGGTAAATTTAGTTAATATTAGAAA AAAAAAGTGTATCCAATAGGCTCTATAAACAACTCTTCAAATTTAGTGGCTTTCT GTAGGTTTAATGAGTCTGTTGGATTAGCCTACACTTTTTCTGTAAAATCTATTTTA GATAGTAGCTAAATCAGTAAATTTGGCTAGTATTTTTAGCTATTCTCTTGGAGTTT GCTATAAGACCAGAACATGTAAATTGGAAGTTTGTGGACCCGGACGAGAATGCA TGACAAATCCAGAGTATTGATGATGGAATTCACCTATTTTACCCGACTCTTCCATT GTGTCCATTTCTCATCATCCCCGGGCGCTTTCTGCATCCGGTACAGCTGACATGAC ACGTTCACGCGTTACATGGCTGATGGCTCACAAGTCACCCCCACATGTCTAGTGT TCGCCCAGGCAGATCGTCCTCGGCCTGCGCTGCCGTGCTCTTGCCGCCGCTTGCTT GGGCCCTGCTGGCGCCCGCTGCCGATCACACGGCCTACGCGGTGCAGGCAGCGC CACCGAACCCGCAGTCTTGTTGTGCCGATAGGTGGCAGTGGCAGTGGCACTGGCA CGGCACGCGATCGATCGCTCCGCTCATCTGCTGACAGTGGATAGAGCAGCGTTGG CCGTTGGGGCCGGATCTCCGTGAAGCGGTCGTCCCTGCTGTACTGTGCCGCTATG GCGTGTCGCTTTCGCCATGTTTTCTTTTCTTTTTTTTTTCTTTTTCTTTTTGCTAGGG CGGTTTCTCGTTCGCTGGTAACAGGGACCACTTCGGTTGATCCGTTGAATTTACTG AAAGAGATGGGAATGGTCGCTGTGCCCGGGACATTGAATGAGATGTTGTGTAAG TGAATATGGCTTTAGCCTTTTGCGAGTGGGAATGGATGCTAAACGAACACAAACC GGGTTTAAACCAGAGGCCGACACGGCACACACGGCGACATTCACCGCCGGCTTC CTCCGTCGCCACTCGGCACAAGGCTCATCAGTCGCCGATGCCCGATGCGATCAAC GGAAGCGGATGGCCCGCTTCTTTAGAATTGGCACAGGAACACTGGCCACTGCCCT TGATGTGCAATTATGCCTGCGAAAGCCTAGGCAACACACGCGAATAAACGAGCG AATGACACGGAAAGCTGATGTGGTATGAATTATACAACATTATGGGCCAAAATA TTATTCTATCCACCATTGTGTAGCCACAGCATCGGTATTTGAGTTGTGCGAGGAC AAATCCCTCGTGAGGTCAAAAACAGCAAATAATAAACCCATCTCCTGAAGACAC CAAAAAAAAGGAGCAGCTCCTCGTGTCAATGAACAAGCGTCACAAGAAAAGGGA GCACGTAAATAACCTCTTCAATTGCTTCAGCATGAAAAGAACGGGAAGAAATGC AAGTCTACAGAGGAAAGTGCAGCTGTTTCGGCTGCCATGGCAAGTTCCTACATGG GCGAGGAAAAGCTGAACTGGATTCCAGTCTTCGCGCTGTCATGCTCAGCTTGCTT TAGGATGCGGCAATAGTTCACCTGGATGAAAAAGATACAAGTTAGTCTTGAAGC AGTCGAGTGGACATCCAAAGTATCAAAATCGAAAGCTTGTAAATGGGGAAGGAA ATATACCTCTACCCGGAAAAGTTTGGTAGGCAAAATAATCCCAACGCCAGCAGA GCTCCGGAACGTTTGCCGAAATTCAGAAGCCGAAAAGTTCTTGTACTCACCCTCC 
GACAGTTTCGCAAGGTTTCCAGCAGTAAGGAATGCGTGGCCATGGATTCCAGCGT CTCTGAATATCTTGAGGGGCAGATCAAAAGAAAGGTCAGCGAAGGCAGACACGG CCAGATCACCTCCCAAGTAATCCCTTCCAGGGTCAGCCGAGCCACTCTCCGAGTT ATTAAGGACATGCCTCCGCGCCTCTGTTGGGCCAACTCCCCTTAATCTGAAACCC AGCAGAGATGACGGTCCGCCCAAGCTGCACACTGGAGAAGAATTACCTCCAAGA TAAAACCTCTCTGGCACTGATGAAGTCGAATTCATGAATCCCCCTGCAAGCGGTA AAATGACACCCGCTCCTACACCAACGTTGAGAGCAGCACTATAAAATCCCAAAG GCACAGCACCACGTACATCGAACTCCTGAGAGCAAACCCAACGGCAATATTTTCT AACGGACACCAACCAGCGAACCAGCAGCGTCGCGTCGGGCCAAGCGAAGCAGAC GGCACGGCATCTCTGTCGCTGCCTCTGGACCCCTCTCGAGAGTTCCGCTCCACCGT TGGACTTGCTCCGCTGTCGGCATCCAGAAATTGCGTGGCGGAGCGGCAGACGTGA GCCGGCACGGCAGGCGGCCTCCTCCTCCTCTCACGGCACCGGCAGCTACGGGGG ATTCCTTTCCCACCGCTCCTTCGCTGTCCCTTCCTCGCCC

[0637] SEQ ID NO:431

AGTTGGGAAGGCAAAACGAATATAAGTGCATTCGGATTACTGTTTAGTCGAGTCA TATTTAAGGAATTCATTGTAAATGTTCTAACCTAACCTAAGTATTAGGCAGCTAT GGCTGATATGGATCTGATTGGACTTGATTTATCCATGATAAGTTTAAGAGCAACT CAAAGAGGTTAGGTATATATGGTTTTGTAAAGGTAAATTTAGTTAATATTAGAAA AAAAAAGTGTATCCAATAGGCTCTATAAACAACTCTTCAAATTTAGTGGCTTTCT ATCCATCCACCTTTGCTCTCTATTTTTGGATAGCCTGATTTACTCTCTATTCAGTCC GTAGGTTTAATGAGTCTGTTGGATTAGCCTACACTTTTTCTGTAAAATCTATTTTA GATAGTAGCTAAATCAGTAAATTTGGCTAGTATTTTTAGCTATTCTCTTGGAGTTT GCTATAAGACCAGAACATGTAAATTGGAAGTTTGTGGACCCGGACGAGAATGCA TGACAAATCCAGAGTATTGATGATGGAATTCACCTATTTTACCCGACTCTTCCATT GTGTCCATTTCTCATCATCCCCGGGCGCTTTCTGCATCCGGTACAGCTGACATGAC ACGTTCACGCGTTACATGGCTGATGGCTCACAAGTCACCCCCACATGTCTAGTGT TCGCCCAGGCAGATCGTCCTCGGCCTGCGCTGCCGTGCTCTTGCCGCCGCTTGCTT GGGCCCTGCTGGCGCCCGCTGCCGATCACACGGCCTACGCGGTGCAGGCAGCGC CACCGAACCCGCAGTCTTGTTGTGCCGATAGGTGGCAGTGGCAGTGGCACTGGCA CGGCACGCGATCGATCGCTCCGCTCATCTGCTGACAGTGGATAGAGCAGCGTTGG CCGTTGGGGCCGGATCTCCGTGAAGCGGTCGTCCCTGCTGTACTGTGCCGCTATG GCGTGTCGCTTTCGCCATGTTTTCTTTTCTTTTTTTTTTCTTTTTCTTTTTGCTAGGG CGGTTTCTCGTTCGCTGGTAACAGGGACCACTTCGGTTGATCCGTTGAATTTACTG AAAGAGATGGGAATGGTCGCTGTGCCCGGGACATTGAATGAGATGTTGTGTAAG TGAATATGGCTTTAGCCTTTTGCGAGTGGGGCGGCAATGCACGGCATGAACTATA ATTTCCGGTCAAACTTTTGTGTGGAAATGGATGCTAAACGAACACAAACCGGGTT TAAACCAGAGGCCGACACGGCACACACGGCGACATTCACCGCCGGCTTCCTCCGT CGCCACTCGGCACAAGGCTCATCAGTCGCCGATGCCCGATGCGATCAACGGAAG CGGATGGCCCGCTTCTTTAGAATTGGCACAGGAACACTGGCCACTGCCCTTGATG TGCAATTATGCCTGCGAAAGCCTAGGCAACACACGCGAATAAACGAGCGAATGA CACGGAAAGCTGATGTGGTATGAATTATACAACATTATGGGCCAAAATATTATTC TATCCACCATTGTGTAGCCACAGCATCGGTATTTGAGTTGTGCGAGGACAAATCC CTCGTGAGGTCAAAAACAGCAAATAATAAACCCATCTCCTGAAGACACCAAAAA AAAGGAGCAGCTCCTCGTGTCAATGAACAAGCGTCACAAGAAAAGGGAGCACGT AAATAACCTCTTCAATTGCTTCAGCATGAAAAGAACGGGAAGAAATGCAAGTCT ACAGAGGAAAGTGCAGCTGTTTCGGCTGCCATGGCAAGTTCCTACATGGGCGAG GAAAAGCTGAACTGGATTCCAGTCTTCGCGCTGTCATGCTCAGCTTGCTTTAGGA TGCGGCAATAGTTCACCTGGATGAAAAAGATACAAGTTAGTCTTGAAGCAGTCG AGTGGACATCCAAAGTATCAAAATCGAAAGCTTGTAAATGGGGAAGGAAATATA 
CCTCTACCCGGAAAAGTTTGGTAGGCAAAATAATCCCAACGCCAGCAGAGCTCC GGAACGTTTGCCGAAATTCAGAAGCCGAAAAGTTCTTGTACTCACCCTCCGACAG TTTCGCAAGGTTTCCAGCAGTAAGGAATGCGTGGCCATGGATTCCAGCGTCTCTG AATATCTTGAGGGGCAGATCAAAAGAAAGGTCAGCGAAGGCAGACACGGCCAG ATCACCTCCCAAGTAATCCCTTCCAGGGTCAGCCGAGCCACTCTCCGAGTTATTA AGGACATGCCTCCGCGCCTCTGTTGGGCCAACTCCCCTTAATCTGAAACCCAGCA GAGATGACGGTCCGCCCAAGCTGCACACTGGAGAAGAATTACCTCCAAGATAAA ACCTCTCTGGCACTGATGAAGTCGAATTCATGAATCCCCCTGCAAGCGGTAAAAT GACACCCGCTCCTACACCAACGTTGAGAGCAGCACTATAAAATCCCAAAGGCAC AGCACCACGTACATCGAACTCCTGAGAGCAAACCCAACGGCAATATTTTTGTAAT AGTGATGGTCAGAACTGAGAAGATCAGATAAAATTATACACTGATGCAATTATTT CATAGTTTCGCCCATGAACTGTAAGGGCTAGACAAAGCAAAAAGTAAGACATGA AGGGCAAGAGAATAACCTGCCGGAAATATCTCAATCCTTTGCTATTCCATAGACC ACCAACTTGAGAAGTTGACTGAAACGCATATCCTTTCGTTGGCCTAAG ATGTGAA TCCCTCTTATCAATCTTGTATGTGTACTTCAATGCAGAAAGAAGGTTATGCCCTAA CTGCCTCCTTATGGCCTTTGATGAGACACGTGATGGATCAGTTAAGGTACGCCAC GCAAGGTTGTATGACAAGTCATGGTTCCTTGTTGACAGCAAACCAAATGAAAGGC CAAGTAGGCGCTCCTTGTATGATGAAAACTTCAGCCAATCTTGTGATGACAAAGA TGCCCGAGCCATCAATGGTGTTGGTATTGATTTAAACCTCGGTAGGCAGACTCCA ACACCAACCTCTGTTGTTTGGTCCCAACCAAAGGATCCTGATGCATCCCAGATGT CACCATAGCCAAACAAGTTCTTCAACTTAAGTGACCCTTCCAGCGACCAAGATCT TGCCTACAAGAGTGGCAAGCACAGTCA

[0638] SEQ ID NO:464

GTGCATTCGGATTACTGTTTAGTCGAGTCATATTTAAGGAATTCATTGTAAATGTT CTAACCTAACCTAAGTATTAGGCAGCTATGGCTGATATGGATCTGATTGGACTTG ATTTATCCATGATAAGTTTAAGAGCAACTCAAAGAGGTTAGGTATATATGGTTTT GTAAAGGTAAATTTAGTTAATATTAGAAAAAAAAAGTGTATCCAATAGGCTCTAT AAACA

[0639] All patents, patent applications and publications mentioned herein are hereby incorporated by reference in their entirety.

[0640] Although the disclosure has been provided in some detail by way of illustration and example for the purposes of clarity of understanding, it will be apparent to those skilled in the art that various changes and modifications can be practiced without departing from the spirit or scope of the disclosure. Accordingly, the foregoing descriptions and examples should not be construed as limiting.

Claims

What is claimed is:

1. A double-stranded polynucleotide comprising:

an exogenous nucleic acid sequence;

a sequence comprising one or more target sites for one or more nucleases, wherein the one or more target sites are not within the exogenous nucleic acid sequence and further wherein the double-stranded polynucleotide does not comprise homology arms.

2. The double-stranded polynucleotide of claim 1, wherein the exogenous nucleic acid sequence is at least 1 kb in length.

3. The double-stranded polynucleotide of claim 1 or claim 2, wherein the exogenous nucleic acid sequence polynucleotide is a plasmid.

4. The double-stranded polynucleotide of any of claims 1 to 3, wherein the exogenous nucleic acid sequence comprises a transgene.

5. The double-stranded polynucleotide of any of claims 1 to 4, wherein the sequence comprises two target sites and a spacer of at least 5 nucleotides between the two target sites.

6. A cell comprising the double-stranded polynucleotide of any of claims 1 to 5.

7. A method of integrating an exogenous nucleic acid sequence polynucleotide into an endogenous locus of a cell, the method comprising:

introducing a double-stranded polynucleotide according to any of claims 1 to 5 into the cell;

introducing one or more nucleases into the cell, wherein the nucleases cleave the double-stranded polynucleotide and cleave the endogenous locus such that the exogenous nucleic acid sequence polynucleotide is integrated into the endogenous locus.

8. The method of claim 7, wherein the exogenous nucleic acid sequence polynucleotide is integrated in a forward orientation.

9. The method of claim 7, wherein the exogenous nucleic acid sequence polynucleotide is integrated in a reverse orientation.

10. The method of any of claims 7 to 9, wherein the same nucleases cleave the endogenous locus and the donor polynucleotide.

11. The method of any of claims 7 to 9, wherein different nucleases cleave the endogenous locus and the donor polynucleotide.

12. The method of any of claims 7 to 11, wherein the exogenous nucleic acid sequence polynucleotide is integrated into the endogenous locus via homology-independent mechanisms.

13. The method of any of claims 7 to 12, wherein the nucleic acid comprising the two target sites and the spacer is non-naturally occurring such that the target sites are not re-created following integration of the exogenous nucleic acid sequence polynucleotide.

14. The method of any of claims 7 to 13, wherein the nucleases generate a deletion in the endogenous locus and the exogenous nucleic acid sequence polynucleotide is integrated into the deletion.

15. The method of any of claims 7 to 14, wherein the cell is a eukaryotic cell.

16. The method of claim 15, wherein the cell is a plant or mammalian cell.

17. The method of claim 16, wherein the plant cell is a dicotyledonous or a monocotyledonous plant cell.

18. A transgenic organism comprising an exogenous nucleic acid sequence polynucleotide integrated according to the method of any of claims 7 to 17.