WO2022152192A1 - Enzyme involved in phage diaminopurine synthesis, and use thereof - Google Patents

Publication number
WO2022152192A1
Authority
WO
WIPO (PCT)
Prior art keywords
polypeptide
seq
phage
bacteriophage
genome
Prior art date
Application number
PCT/CN2022/071726
Other languages
French (fr)
Chinese (zh)
Inventor
张雁
周彦
仝杨
Original Assignee
天津大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 天津大学 (Tianjin University)
Publication of WO2022152192A1

Classifications

    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14Hydrolases (3)
    • C12N9/16Hydrolases (3) acting on ester bonds (3.1)
    • AHUMAN NECESSITIES
    • A23FOODS OR FOODSTUFFS; TREATMENT THEREOF, NOT COVERED BY OTHER CLASSES
    • A23LFOODS, FOODSTUFFS, OR NON-ALCOHOLIC BEVERAGES, NOT COVERED BY SUBCLASSES A21D OR A23B-A23J; THEIR PREPARATION OR TREATMENT, e.g. COOKING, MODIFICATION OF NUTRITIVE QUALITIES, PHYSICAL TREATMENT; PRESERVATION OF FOODS OR FOODSTUFFS, IN GENERAL
    • A23L3/00Preservation of foods or foodstuffs, in general, e.g. pasteurising, sterilising, specially adapted for foods or foodstuffs
    • A23L3/34Preservation of foods or foodstuffs, in general, e.g. pasteurising, sterilising, specially adapted for foods or foodstuffs by treatment with chemicals
    • A23L3/3454Preservation of foods or foodstuffs, in general, e.g. pasteurising, sterilising, specially adapted for foods or foodstuffs by treatment with chemicals in the form of liquids or solids
    • A23L3/3463Organic compounds; Microorganisms; Enzymes
    • A23L3/3571Microorganisms; Enzymes
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61KPREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K35/00Medicinal preparations containing materials or reaction products thereof with undetermined constitution
    • A61K35/66Microorganisms or materials therefrom
    • A61K35/74Bacteria
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61KPREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K35/00Medicinal preparations containing materials or reaction products thereof with undetermined constitution
    • A61K35/66Microorganisms or materials therefrom
    • A61K35/76Viruses; Subviral particles; Bacteriophages
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61PSPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P31/00Antiinfectives, i.e. antibiotics, antiseptics, chemotherapeutics
    • A61P31/04Antibacterial agents
    • CCHEMISTRY; METALLURGY
    • C07ORGANIC CHEMISTRY
    • C07KPEPTIDES
    • C07K14/00Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/005Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from viruses
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09Recombinant DNA-technology
    • C12N15/11DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/52Genes encoding for enzymes or proenzymes
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09Recombinant DNA-technology
    • C12N15/63Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/74Vectors or expression systems specially adapted for prokaryotic hosts other than E. coli, e.g. Lactobacillus, Micromonospora
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N7/00Viruses; Bacteriophages; Compositions thereof; Preparation or purification thereof
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/10Transferases (2.)
    • C12N9/12Transferases (2.) transferring phosphorus containing groups, e.g. kinases (2.7)
    • C12N9/1229Phosphotransferases with a phosphate group as acceptor (2.7.4)
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14Hydrolases (3)
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/88Lyases (4.)
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/93Ligases (6)
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12PFERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P19/00Preparation of compounds containing saccharide radicals
    • C12P19/26Preparation of nitrogen-containing carbohydrates
    • C12P19/28N-glycosides
    • C12P19/30Nucleotides
    • C12P19/32Nucleotides having a condensed ring system containing a six-membered ring having two N-atoms in the same ring, e.g. purine nucleotides, nicotineamide-adenine dinucleotide
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12PFERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P19/00Preparation of compounds containing saccharide radicals
    • C12P19/26Preparation of nitrogen-containing carbohydrates
    • C12P19/28N-glycosides
    • C12P19/30Nucleotides
    • C12P19/34Polynucleotides, e.g. nucleic acids, oligoribonucleotides
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12YENZYMES
    • C12Y207/00Transferases transferring phosphorus-containing groups (2.7)
    • C12Y207/04Phosphotransferases with a phosphate group as acceptor (2.7.4)
    • C12Y207/04008Guanylate kinase (2.7.4.8)
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12YENZYMES
    • C12Y301/00Hydrolases acting on ester bonds (3.1)
    • C12Y301/05Triphosphoric monoester hydrolases (3.1.5)
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12YENZYMES
    • C12Y306/00Hydrolases acting on acid anhydrides (3.6)
    • C12Y306/01Hydrolases acting on acid anhydrides (3.6) in phosphorus-containing anhydrides (3.6.1)
    • C12Y306/01009Nucleotide diphosphatase (3.6.1.9), i.e. nucleotide-pyrophosphatase
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12YENZYMES
    • C12Y403/00Carbon-nitrogen lyases (4.3)
    • C12Y403/02Amidine-lyases (4.3.2)
    • C12Y403/02002Adenylosuccinate lyase (4.3.2.2)
    • AHUMAN NECESSITIES
    • A23FOODS OR FOODSTUFFS; TREATMENT THEREOF, NOT COVERED BY OTHER CLASSES
    • A23VINDEXING SCHEME RELATING TO FOODS, FOODSTUFFS OR NON-ALCOHOLIC BEVERAGES AND LACTIC OR PROPIONIC ACID BACTERIA USED IN FOODSTUFFS OR FOOD PREPARATION
    • A23V2002/00Food compositions, function of food ingredients or processes for food or foodstuffs
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2795/00Bacteriophages
    • C12N2795/00011Details
    • C12N2795/00021Viruses as such, e.g. new isolates, mutants or their genomic sequences
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2795/00Bacteriophages
    • C12N2795/00011Details
    • C12N2795/00031Uses of virus other than therapeutic or vaccine, e.g. disinfectant
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2795/00Bacteriophages
    • C12N2795/00011Details
    • C12N2795/00032Use of virus as therapeutic agent, other than vaccine, e.g. as cytolytic agent

Definitions

  • The present application relates generally to the fields of biochemistry and molecular biology. In particular, it discloses bacteriophages having a diaminopurine (also known as 2,6-diaminopurine, 2-aminoadenine, or Z base)-containing genome, the Z base synthesis pathway, the enzymes involved, and the applications of these enzymes.
  • the present application provides a polypeptide having 2-aminodeoxyadenosuccinate (ADAS) synthase activity, capable of using one or more of ATP, dATP, GTP and dGTP, together with dGMP and Asp, as substrates to catalyze the formation of 2-aminodeoxyadenosuccinate (ADAS); compared with the GDxxKG catalytic motif of adenylosuccinate synthetase (PurA), the catalytic motif of the polypeptide is changed to GSxxKG, where x represents any amino acid residue.
  • ADAS 2-aminodeoxyadenosuccinate
  • dGMP and Asp are common substrates
  • ATP, dATP, GTP, and dGTP are variable substrates.
  • the 2-aminodeoxyadenosuccinate (ADAS) synthase using ATP/dATP as substrate is called PurZ (PurZ was identified first).
  • the 2-aminodeoxyadenosuccinate (ADAS) synthase using GTP/dGTP as substrate is called PurZ0 (PurZ0 was distinguished as a subgroup in studies that followed the identification of PurZ).
  • when the polypeptide is aligned with the amino acid sequence of adenylosuccinate synthetase from E. coli (SEQ ID NO: 72), the residue at the position corresponding to position 303 of SEQ ID NO: 72 is changed from R to L.
  • when aligned with the amino acid sequence set forth in SEQ ID NO: 2, the polypeptide has T, N at position 306, F at position 307, and N at position 309 corresponding to SEQ ID NO: 2.
  • the polypeptide comprises the sequence set forth in any one of SEQ ID NOs: 1-4, 9-69, 71 and 92-146; or a variant of the sequence set forth in any one of SEQ ID NOs: 1-4, 9-69, 71 and 92-146 that has one or more amino acid insertions, deletions and/or substitutions and retains 2-aminodeoxyadenosuccinate (ADAS) synthase activity; or a fragment of the sequence set forth in any one of SEQ ID NOs: 1-4, 9-69, 71 and 92-146 that retains the catalytic motif.
  • the present application provides a polypeptide which has 2'-deoxyadenosine 5'-triphosphate triphosphohydrolase (dATPase) activity and can catalyze the hydrolysis of dATP to generate 2'-deoxyadenosine (dA); the polypeptide contains metal- and ligand-binding pockets.
  • dATPase 2'-deoxyadenosine 5'-triphosphate triphosphohydrolase
  • the polypeptide also catalyzes the hydrolysis of dADP and dAMP to 2'-deoxyadenosine (dA).
  • the polypeptide comprises Co²⁺ as a divalent metal cofactor.
  • the polypeptide comprises the sequence set forth in any one of SEQ ID NOs: 5-7 and 73-91; or a variant of the sequence set forth in any one of SEQ ID NOs: 5-7 and 73-91 that has one or more amino acid insertions, deletions and/or substitutions and retains dATPase activity; or a fragment of the sequence set forth in any one of SEQ ID NOs: 5-7 and 73-91 that retains the catalytic motif.
  • the application provides a polypeptide having dATP and dGTP pyrophosphohydrolase activities, capable of catalyzing the hydrolysis of dATP to dAMP and of dGTP to dGMP; the polypeptide comprises a metal- and ligand-binding pocket.
  • the polypeptide comprises Co²⁺ as a divalent metal cofactor.
  • the polypeptide comprises the sequence set forth in SEQ ID NO: 8; or a variant of the sequence set forth in SEQ ID NO: 8 that has one or more amino acid insertions, deletions and/or substitutions and retains dATP and dGTP pyrophosphohydrolase activity; or a fragment of the sequence set forth in SEQ ID NO: 8 that retains the catalytic motif.
  • the polypeptides of the first to third aspects can all participate in the Z base synthesis pathway of phage, of which the polypeptide described in the first aspect is the most critical one.
  • the present application provides nucleic acid molecules encoding the polypeptides of the first to third aspects.
  • the present application provides a vector comprising the nucleic acid molecule of the fourth aspect.
  • the present application provides a method for modifying a phage, comprising introducing a nucleic acid molecule encoding the polypeptide of the first aspect into the genome of the phage, and expressing the polypeptide of the first aspect by the phage.
  • the method further comprises introducing into the genome of the bacteriophage a nucleic acid molecule encoding the polypeptide of the second aspect, and expressing the polypeptide of the second aspect by the bacteriophage; and/or introducing into the genome of the bacteriophage a nucleic acid molecule encoding the polypeptide of the third aspect, and expressing the polypeptide of the third aspect by the bacteriophage.
  • the method further comprises introducing a nucleic acid molecule encoding an adenylosuccinate lyase (PurB) into the genome of the bacteriophage, and expressing the adenylosuccinate lyase (PurB) by the bacteriophage; and/or introducing a nucleic acid molecule encoding GMP kinase (GK) into the genome of the bacteriophage, and expressing GMP kinase (GK) by the bacteriophage.
  • PurB adenylosuccinate lyase
  • GK GMP kinase
  • the present application provides a phage obtained by the method of the sixth aspect.
  • the present application provides a host cell comprising the phage of the seventh aspect.
  • the host cell is a bacterial cell.
  • the application provides use of the phage of the seventh aspect or the host cell of the eighth aspect in diaminopurine deoxyribonucleotide (dZTP) synthesis, DNA synthesis, DNA origami, DNA-based data storage, or the preparation of antibacterials, bactericides or preservatives.
  • dZTP diaminopurine deoxyribonucleotide
  • FIG. 1 Sequence and structural analysis of PurZ.
  • A Equivalent biosynthetic pathways of AMP and dZMP;
  • B Sequence similarity network of putative PurZ;
  • C Homology model of SbPurZ;
  • D NMP binding site in SbPurZ and EcPurA;
  • E NMP binding site Sequence ID;
  • F Purine-interacting residues in the NTP binding site of PurZ/EcPurA;
  • G Sequence ID of the NTP binding site; residue numbers from EcPurA and Cp/SbPurZ.
  • FIG. 2 Substrate specificity of SbPurZ, ApdATPase and ApDUF550.
  • A substrate specificity of SbPurZ;
  • B complete reaction of SbPurZ, dGMP, ATP and Asp;
  • C reaction of Asp, dGMP and ATP, omitting SbPurZ;
  • D two-step reaction combining SbPurZ and EcPurB.
  • FIG. 3 Putative biosynthetic pathway of phage Z-containing genomes.
  • A Genomic neighborhood of PurZ in phages;
  • B Biosynthetic pathway of the Z-containing genome;
  • C Substrate specificity of ApdATPase;
  • D Substrate specificity of ApDUF550.
  • Figure 4 Validation of Z base incorporation in the genome of Acinetobacter phage SH-Ab 15497.
  • FIG. 7 The putative catalytic mechanism of SbPurZ. Atoms derived from dGMP, Asp and ATP are highlighted in different shades of gray, respectively.
  • FIG. 8 Sequence similarity network of the PurA superfamily. Networks are shown with an E-value of 10⁻¹⁶⁰, where each node represents sequences with ≥85% sequence identity.
  • FIG. 9 SDS-PAGE gel analysis of recombinant proteins purified in this application.
  • Panels A-J show 4-20% SDS gels with molecular weight markers in lane 1 and 1, 2 and 4 µg of purified protein in lanes 2, 3 and 4.
  • ApdATPase is a fusion protein containing an N-terminal MBP.
  • FIG. 10 PurZ enzyme activity assay.
  • A Time-dependent UV-Vis spectra of the SbPurZ assay;
  • B-D UV-Vis spectra of the ApPurZ, SpPurZ and VpPurZ assays;
  • E UV-Vis difference spectra of the SbPurZ assay, obtained by subtracting the UV-Vis spectrum at time 0 from each spectrum collected at the different time points;
  • F SbPurZ-catalyzed time-dependent phosphate release, detected using a phosphomolybdate assay.
  • G pH dependence of SbPurZ activity
  • H Determination of the extinction coefficient (ε) of ADAS at 287 nm
  • I ESI-MS detection of 6-phosphoryl-dGMP formation by SbPurZ in the absence of Asp (i.e., negative control)
  • J Quantification of phosphate released in the complete SbPurZ reaction mixture and in the reaction mixture with Asp omitted
  • K Mass spectrometry analysis of pure dGMP, ATP and ADP
  • L MS2 data of ADAS, dZMP and dGMP in Figure 2, B and D.
  • FIG. 11 Enzyme kinetics analysis of PurZ.
  • A-D dGMP, ATP, and Asp were used as substrates, and in E, dIMP, ATP, Asp were used as substrates.
  • FIG. 12 ESI-MS analysis of the Sp/ApPurZ-EcPurB-SeGK reaction.
  • Figure 13 Sequence and structural analysis of dATPases.
  • A Metal and dATP binding motifs in dATPase homologues;
  • B Homology model of CpdATPase docked to dATP. Base-binding and metal-binding residues are marked in different shades of gray.
  • FIG. 14 ESI-MS detection of dA formation by CpdATPase and ApdATPase.
  • Figure 15 Enzymatic activity and enzyme kinetics determination of dATPase.
  • A Standard curves for phosphate, pyrophosphate and triphosphate.
  • B Metal-dependent colorimetric phosphate assay results of Cp/SpdATPase;
  • C Substrate specificity of Cp/SpdATPase;
  • D-F Kinetic constants of Cp/Sp/ApdATPase with dATP, dADP or dAMP as substrate.
  • FIG. 16 Colorimetric phosphate assay of Cp/Sp/ApdATPase.
  • A-C Reaction of Cp/Sp/ApdATPase with dATP, dADP and dAMP, respectively.
  • FIG. 17 Colorimetric and mass spectrometric characterization of ApDUF550-catalyzed reactions.
  • A 14-20% SDS gel results of purified ApDUF550, wherein lane 1 is the molecular weight marker, and lanes 2-4 are 1, 2 and 4 µg of purified ApDUF550;
  • B Metal-dependent colorimetric phosphate assay results of ApDUF550;
  • C and D ESI(-) m/z spectra of dATP and dGTP involved in the reaction catalyzed by ApDUF550 to form dAMP and dGMP, respectively; ApDUF550 was omitted for negative control.
  • FIG. 18 ApPurZ PCR product with A and Z.
  • A UV-Vis spectrum of ApPurZ PCR product containing A and Z;
  • B Agarose gel analysis of ApPurZ PCR product containing A and Z.
  • Figure 20 Changes in nanopore signal upon Z substitution.
  • A ApPurZ reads containing A match the expected signal;
  • B Corresponding ApPurZ reads containing Z show a significant signal change compared to the expected signal;
  • C When Z-containing ApPurZ reads are used as input data and A-containing ApPurZ reads are used as reference data, the A-to-Z modifications can be detected by Tombo (43).
  • The KS test uses the largest vertical difference between two cumulative distribution curves.
  • The D-statistic describes the difference between two sets of data, with a larger D-statistic indicating a higher probability of modified bases.
  • Figure 21 Distribution of cwDTW alignment scores for A- and Z-containing ApPurZ.
  • FIG. 22 Restriction endonuclease digestion of phage genomic DNA and Z-containing PCR products.
  • A Restriction maps of Sau3AI and TaqI on the genome of Acinetobacter phage SH-Ab 15497;
  • B and C Restriction enzymes used to digest phage genomic DNA and Z-containing ApPurZ PCR product, respectively.
  • FIG. 23 Substrate specificity of GpPurZ0.
  • A Substrate specificity of GpPurZ0;
  • B Absorbance at 220nm-340nm was detected in the whole reaction of GpPurZ0 every 20 minutes;
  • C GpPurZ0 colorimetric phosphate assay using GTP/dGTP as substrates, respectively.
  • Figure 24 Results of HPLC-MS detection of GpPurZ0 and EcPurB activities.
  • A GpPurZ0, reaction equation of PurB;
  • B Asp, dGMP and GTP reaction, omitting GpPurZ0;
  • C GpPurZ0, dGMP and GTP reaction, omitting Asp;
  • D GpPurZ0, dGMP, GTP and Asp complete reaction;
  • E Two-step reaction combining GpPurZ0 and EcPurB.
  • FIG. 25 GpPurZ0 crystal structure elucidation.
  • A Overall structure of GpPurZ0;
  • B GpPurZ0 active center interacting amino acids and the electron cloud of the substrate GTP (hydrolyzed to GDP);
  • C Overall structure comparison of GpPurZ0 and VpPurZ (Vibrio phage phiVC8);
  • D Comparison of the active centers in the crystal structures of GpPurZ0 and VpPurZ.
  • FIG. 26 Expression plasmid construction of the eukaryotic yeast Z-genome. Left: The gene map of the two proteins, ApPurZ and dATPase, which were recombinantly cloned into plasmid pRS426 (pRS426-ApPurZ-dATPase); Right: The gene map of DUF550 was recombinantly cloned into plasmid pRS425 (pRS426-DUF550) .
  • Figure 27 Detection of dZ in yeast genomes induced to express two plasmids containing pRS426-ApPurZ-dATPase and pRS426-DUF550.
  • A LC-UV detection of dZ in the yeast genome; B Extracted ion chromatogram of dZ.
  • Brief Description of the Sequences
  • SEQ ID NO: 1 shows the amino acid sequence of the 2-aminodeoxyadenosuccinate (ADAS) synthase (abbreviated herein as PurZ) of Acinetobacter phage (abbreviated herein as Ap) SH-Ab 15497 identified and tested herein.
  • SEQ ID NO: 2 shows the amino acid sequence of PurZ of the Sinobacteraceae bacterium (abbreviated herein as Sb) bacteriophage contigs identified and tested herein.
  • SEQ ID NO: 3 shows the amino acid sequence of PurZ of the Salmonella phage (abbreviated herein as Sp) identified and tested herein.
  • SEQ ID NO: 4 shows the amino acid sequence of PurZ of the Vibrio phage (abbreviated herein as Vp) identified and tested herein.
  • SEQ ID NO: 5 shows the amino acid sequence of the 2'-deoxyadenosine 5'-triphosphate triphosphohydrolase (abbreviated herein as dATPase) of the Acinetobacter phage (Ap) SH-Ab 15497 identified and tested herein.
  • SEQ ID NO: 6 shows the amino acid sequence of the dATPase of Salmonella phage (Sp) identified and tested herein.
  • SEQ ID NO: 7 shows the amino acid sequence of the dATPase of the cyanobacteriophage (Cp) identified and tested herein.
  • SEQ ID NO: 8 shows the amino acid sequence of the dATP and dGTP pyrophosphohydrolase (abbreviated herein as DUF550) of Acinetobacter phage (Ap) SH-Ab 15497 identified and tested herein.
  • SEQ ID NOs: 9-69 show the amino acid sequences of PurZ or PurZ0 from different bacteriophages identified in this application, and their database entry information is as follows:
  • SEQ ID NO: 70 shows the amino acid sequence corresponding to PurZ in the genome sequencing results of cyanobacteriophage (Cp) reported in 2003, GenBank_AX955019.1.
  • SEQ ID NO: 71 shows the amino acid sequence corresponding to PurZ in the genome sequencing results of cyanobacteriophage (Cp) reported in 2018, SRA_SRR8295598.
  • SEQ ID NO: 72 shows the amino acid sequence of adenylosuccinate synthetase (abbreviated herein as PurA) of Escherichia coli (abbreviated herein as Ec) used in the analysis, comparison and testing of the present application.
  • PurA adenylosuccinate synthetase
  • Ec Escherichia coli
  • SEQ ID NOs: 73-91 show the amino acid sequences of dATPases from different bacteriophages identified in this application, and their database entry information is as follows:
  • nucleic acids are written left to right in 5' to 3' orientation; amino acid sequences are written left to right in amino to carboxy orientation, respectively.
  • Numerical ranges include the numbers defining the range.
  • Amino acids may be represented herein by either their commonly known three-letter symbols or the one-letter symbols recommended by the IUPAC-IUB Biochemical Nomenclature Committee.
  • nucleotides can be represented by generally accepted one-letter codes. The terms defined above are more fully defined by reference to the specification as a whole.
  • H10 refers to histidine (H) at position 10 after M.
  • polypeptide and protein are used interchangeably herein to refer to polymers of amino acid residues as well as variants and synthetic and naturally occurring analogs thereof. Accordingly, these terms apply to naturally occurring amino acid polymers and their naturally occurring chemical derivatives, as well as to amino acid polymers in which one or more amino acid residues are synthetic, non-naturally occurring amino acids (such as chemical analogs of the corresponding naturally occurring amino acids).
  • Such derivatives include, for example, post-translational modifications and degradation products, including phosphorylated, glycosylated, oxidized, isomerized, carboxylated, and deaminated variants of polypeptide fragments.
  • enzyme active center refers to the part of the enzyme molecule that directly binds the substrate molecule and catalyzes its chemical reaction. It is generally believed that the active center is mainly composed of two functional sites: the first is the catalytic site, where a bond of the substrate is broken or a new bond is formed so that the substrate undergoes a chemical change; the second is the binding site, through which the substrate binds to the enzyme molecule.
  • the functional site is composed of a few amino acid residues, or some groups on these residues, that are relatively close in the three-dimensional structure of the enzyme molecule. These residues may be far apart in the primary structure, or even located on different peptide chains, but are brought close to each other in the spatial conformation through the coiling and folding of the peptide chains. For enzymes that require coenzymes, the coenzyme molecules (such as the metal ions Zn²⁺ and/or Mn²⁺), or a certain part of their structure, also form part of the functional site.
  • amino acid refers to a compound in which a hydrogen atom on a carbon atom of a carboxylic acid is replaced by an amino group, and the amino acid molecule contains two functional groups, an amino group and a carboxyl group. It includes naturally occurring and non-naturally occurring amino acids as well as amino acid analogs and mimetics. Naturally occurring amino acids include the 20 (L)-amino acids used in protein biosynthesis as well as other amino acids such as 4-hydroxyproline, hydroxylysine, carboxylated Cystine, citrulline and ornithine.
  • Non-naturally occurring amino acids include, for example, (D)-amino acids, norleucine, norvaline, p-fluorophenylalanine, ethionine, and the like, which are known to those skilled in the art.
  • Amino acid analogs include modified forms of naturally and non-naturally occurring amino acids. Such modifications may include, for example, substitution of chemical groups and moieties on amino acids, or derivatization of amino acids.
  • Amino acid mimetics include, for example, organic structures that exhibit functionally similar properties, such as the charge and charge-spacing properties of amino acids.
  • an organic structure that mimics arginine has a positively charged moiety located in a similar molecular space and having the same degree of mobility as the ε-amino group of the side chain of a naturally occurring Arg amino acid.
  • Mimics also include constrained structures to maintain optimal steric and charge interactions of amino acids or amino acid functional groups.
  • One skilled in the art can determine what structures constitute functionally equivalent amino acid analogs and amino acid mimetics.
  • isoenzyme refers to an enzyme that catalyzes the same reaction in an organism but differs in molecular structure.
  • nucleic acid refers to mRNA, RNA, cRNA, eDNA or DNA, including single- and double-stranded forms of DNA.
  • the term generally refers to polymeric forms of nucleotides that are at least 10 bases in length, either ribonucleotides or deoxynucleotides or a modified form of either type of nucleotide.
  • nucleic acid contains the necessary information to direct translation of the nucleotide sequence into a particular protein. The information encoding the protein is expressed using codons.
  • a nucleic acid encoding a protein may contain untranslated sequences (eg, introns) located within the translation region of the nucleic acid or may lack such intervening untranslated sequences (eg, as in cDNA).
  • full-length sequence in reference to a particular polynucleotide or the protein it encodes refers to the entire nucleic acid sequence or the entire amino acid sequence having native (non-synthetic) endogenous sequence.
  • the full-length polynucleotide encodes the full-length, catalytically active form of this particular protein.
  • isolated refers to a polypeptide or nucleic acid, or a biologically active portion thereof, that is substantially or essentially free of the components that normally accompany or react with the protein or nucleic acid as found in its naturally occurring environment.
  • the isolated polypeptide or nucleic acid is substantially free of other cellular material or culture medium, or when chemically synthesized, the isolated polypeptide or nucleic acid is substantially free of chemical precursors or other chemicals .
  • DNA it is understood to be “isolated” unless otherwise specified or clear from the context to be present in a portion of the genome.
  • expression vector as used herein is a recombinant or synthetically produced nucleic acid construct having a series of specific nucleic acid elements that allow transcription of a specific nucleic acid in a host cell.
  • host cell refers to a cell that receives a foreign gene in transformation and transduction (infection).
  • Host cells can be eukaryotic cells such as yeast cells or prokaryotic cells such as E. coli. In the context of host cells involving bacteriophages, host cells refer primarily to bacteria.
  • Diaminopurine (Z) is a unique exception because in cyanophages it completely replaces adenine and forms three hydrogen bonds with thymine.
  • the inventors identified dozens of phages with these enzymes worldwide, and used LC-UV and mass spectrometry to further validate that a representative example of these phages, Acinetobacter phage SH-Ab 15497, contains a Z genome.
  • One of the roles of Z-containing genomes is to confer an evolutionary advantage on phages by allowing them to evade host restriction endonuclease attack.
  • the discovery of a Z-containing genome biosynthetic pathway has enabled large-scale production of Z-DNA (i.e., conventional DNA in which A is replaced by Z) for a variety of applications.
  • the present application provides a polypeptide having 2-aminodeoxyadenosuccinate (ADAS) synthase activity, capable of using one or more of ATP, dATP, GTP and dGTP, together with dGMP and Asp, as substrates to catalyze the formation of 2-aminodeoxyadenosuccinate, wherein, compared with the GDxxKG catalytic motif of adenylate succinate synthase (PurA), the catalytic motif of the polypeptide is changed to GSxxKG, where x represents any amino acid residue.
  • PurZ or PurZ0 is an enzyme identified and characterized for the first time by the inventors of the present application, and is a key enzyme involved in the synthesis of Z bases.
  • PurZ or PurZ0 and PurA, which is known in the art to participate in the synthesis of adenine (A), are homologues in a broad sense and share a similar reaction mechanism (see A in Figure 1), but differences in substrate specificity mean that the two catalyze different reactions.
  • the catalytic motif of PurA has been characterized and revealed to be GDxxKG, for example, see positions 12-17 of EcPurA in Fig.
  • in PurZ or PurZ0 the catalytic motif is changed to GSxxKG. According to the inventors' analyses, replacing the bulkier D with the smaller S can accommodate the additional 2-amino group of the substrate, resulting in different substrate selectivity and specificity. GSxxKG was therefore shown to be the catalytic motif of PurZ or PurZ0. In some embodiments, the catalytic motif of PurZ or PurZ0 is GSTGKG.
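The motif distinction described above (GDxxKG in PurA versus GSxxKG in PurZ or PurZ0, where x is any residue) can be checked with a simple pattern search. The following is a minimal sketch, not the inventors' method; the function name and the two short example sequences are hypothetical and serve only as illustration.

```python
import re

# Motif patterns taken from the text; 'x' (any residue) becomes '.' in the regex.
PURA_MOTIF = re.compile(r"GD..KG")
PURZ_MOTIF = re.compile(r"GS..KG")


def classify_motif(seq: str) -> str:
    """Return 'PurZ-like', 'PurA-like' or 'none' based on the first motif found."""
    seq = seq.upper()
    z_hit = PURZ_MOTIF.search(seq)
    a_hit = PURA_MOTIF.search(seq)
    if z_hit and (not a_hit or z_hit.start() < a_hit.start()):
        return "PurZ-like"
    if a_hit:
        return "PurA-like"
    return "none"


# Hypothetical N-terminal fragments, for illustration only.
print(classify_motif("MKKATVIGSTGKGAV"))   # PurZ-like
print(classify_motif("MGNNVVVLGDEGKGKI"))  # PurA-like
```

A motif match alone is only a first filter; the text also relies on the other conserved residues and on phylogenetic analysis to assign PurZ function.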
  • in the polypeptide, the residue corresponding to position 303 of SEQ ID NO: 72 is changed from R to L.
  • Sequence alignment is a technique commonly used by those skilled in the art and there are conventional alignment tools (eg, BLASTn, BLASTp, BLASTx, etc.). Furthermore, due to the homolog family to which PurZ or PurZ0 and PurA belong, alignment of the two is also easily achieved. PurZ or PurZ0 is compared to PurA, for example, in EcPurA, position 303 is changed from R to aliphatic amino acid L, which is consistent with the fact that the substrate of PurZ or PurZ0 is deoxyribonucleotide (rather than ribonucleotide) .
  • when the polypeptide is aligned with the amino acid sequence set forth in SEQ ID NO: 2 (SbPurZ), it has T at the position corresponding to position 274, N at position 306, F at position 307 and N at position 309 of SEQ ID NO: 2.
  • the above sites are relatively important residues identified by the inventors as being able to exert some influence on the reactivity.
  • the polypeptide comprises the sequence set forth in any one of SEQ ID NOs: 1-4, 9-69, 71 and 92-146; or a variant of the sequence set forth in any one of SEQ ID NOs: 1-4, 9-69, 71 and 92-146 that has one or more amino acid insertions, deletions and/or substitutions and retains 2-aminodeoxyadenosuccinate (ADAS) synthetase activity; or a fragment of the sequence set forth in any one of SEQ ID NOs: 1-4, 9-69, 71 and 92-146 that retains the catalytic motif.
  • SEQ ID NO: 1-4 are examples of PurZ identified and tested in this application, and SEQ ID NO: 71 is the PurZ amino acid sequence of cyanobacterial phage.
  • SEQ ID NOs: 9-69 and 92-146 are the PurZ or PurZ0 sequences of other phages obtained by the inventors in the known phage gene database according to sequence similarity and phylogenetic tree analysis. After comparative analysis, they have consistent catalytic motifs and important residues with the PurZ or PurZ0 examples identified and tested in this application, and are expected to have PurZ or PurZ0 functions.
  • the inventors have fully characterized the structure and function of PurZ or PurZ0 and identified the catalytic motifs and important residues; on this basis, insertions, deletions and/or substitutions of one or more amino acids outside the catalytic and binding domains are not expected to affect PurZ or PurZ0 function.
  • the number of amino acid insertions, substitutions and/or deletions is 1-30, preferably 1-20, more preferably 1-10, such as 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 amino acid insertions, substitutions and/or deletions.
  • fragments resulting from appropriate truncations are expected to be functional.
  • the present application provides a polypeptide which has 2'-deoxyadenosine 5'-triphosphate triphosphohydrolase (dATPase) activity and can catalyze the hydrolysis of dATP to generate 2'-deoxyadenosine (dA), the polypeptide containing metal and ligand binding pockets. In some embodiments, the polypeptide also catalyzes the hydrolysis of dADP and dAMP to 2'-deoxyadenosine (dA).
  • dATPase is another enzyme involved in Z genome biosynthesis discovered and identified by the inventors. Since the phage-infected host may produce dATP, the precursor for incorporation of A bases, one of the roles of dATPase is to promote Z-containing genome synthesis by specifically removing dATP and its precursor dADP, thereby preventing the incorporation of A into the phage genome.
  • the polypeptide comprises Co 2+ as a divalent metal cofactor.
  • the polypeptide comprises the sequence set forth in any one of SEQ ID NOs: 5-7 and 73-91; or a variant of the sequence set forth in any one of SEQ ID NOs: 5-7 and 73-91 that has one or more amino acid insertions, deletions and/or substitutions and retains dATPase activity; or a fragment of the sequence set forth in any one of SEQ ID NOs: 5-7 and 73-91 that retains the catalytic motif.
  • SEQ ID NOs: 5-7 are examples of dATPases identified and tested in this application. Likewise, the inventors have also characterized the catalytic mechanism of dATPase, on the basis of which variants and fragments with the same/similar dATPase function are expected.
  • SEQ ID NOs: 73-91 are the dATPase sequences of other phages obtained by the inventors in the known phage gene database according to sequence similarity and phylogenetic tree analysis. After comparative analysis, they have consistent catalytic motifs and important residues with the dATPase examples identified and tested in this application, and are expected to have dATPase functions.
  • the application provides a polypeptide having dATP and dGTP pyrophosphohydrolase activities, capable of catalyzing the hydrolysis of dATP to dAMP and dGTP to dGMP, the polypeptide comprising a metal and ligand binding pocket.
  • DUF550 is another enzyme involved in Z genome biosynthesis discovered and identified by the inventors. DUF550 can catalyze the hydrolysis of dGTP to dGMP, which is one of the substrates of PurZ or PurZ0, and the presence of DUF550 can increase the level of dZTP, while also depleting dATP to further promote the incorporation of Z.
  • the polypeptide comprises Co 2+ as a divalent metal cofactor.
  • the polypeptide comprises the sequence set forth in SEQ ID NO: 8; or a variant of the sequence set forth in SEQ ID NO: 8 that has one or more amino acid insertions, deletions and/or substitutions and retains dATP and dGTP pyrophosphohydrolase activity; or a fragment of the sequence set forth in SEQ ID NO: 8 that retains the catalytic motif.
  • SEQ ID NO: 8 is an example of DUF550 identified and tested herein. Likewise, the inventors have also characterized the catalytic mechanism of DUF550, and it is expected to obtain variants and fragments with the same/similar DUF550 function on this basis.
  • the polypeptides of the first to third aspects can all participate in the Z base synthesis pathway of phage, of which the polypeptide described in the first aspect is the most critical one.
  • dATPase and DUF550 do not participate directly in the synthesis reaction, but contribute to the synthesis of Z-containing genomes.
  • the present application provides nucleic acid molecules encoding the polypeptides of the first to third aspects.
  • Nucleic acid molecules that are coding sequences can be combined with other DNA sequences, such as promoters, polyadenylation signals, other restriction sites, multiple cloning sites, other coding segments, and the like.
  • Nucleic acids and fusions thereof can be prepared, manipulated and/or expressed using any of a variety of well-established techniques known and available in the art.
  • nucleic acid molecules encoding the polypeptides of the present application, or variants and fragments thereof can be used in recombinant DNA molecules to direct expression of the polypeptides in appropriate host cells. Due to the inherent degeneracy of the genetic code, other DNA sequences encoding substantially identical or functionally equivalent amino acid sequences can also be used in the present application, and these sequences can be used to clone and express a given polypeptide.
  • nucleic acid molecules of the present application can be engineered using methods well known in the art, including but not limited to cloning, processing, expression and/or alteration of activity of the gene product.
  • nucleic acid molecules are produced by artificial synthesis, such as direct chemical synthesis or enzymatic synthesis.
  • the nucleic acid molecule is produced by recombinant techniques.
  • the nucleic acid molecule is an isolated nucleic acid molecule.
  • the present application provides a vector comprising the nucleic acid molecule of the fourth aspect.
  • the nucleic acid molecule of the fourth aspect can be provided in a vector in the form of an expression cassette.
  • the expression cassette may additionally contain a 5' leader sequence that acts to enhance translation.
  • various DNA fragments can be manipulated to provide DNA sequences in the appropriate orientation and, where appropriate, in the appropriate reading frame.
  • adapters or linkers may be employed to join DNA fragments, or other manipulations may be involved to provide convenient restriction sites, remove excess DNA, remove restriction sites, and the like.
  • in vitro mutagenesis, primer repair, restriction, annealing, re-substitutions such as transitions and transversions may be involved.
  • the present application provides a method for modifying a phage, comprising introducing a nucleic acid molecule encoding the polypeptide of the first aspect into the genome of the phage, and expressing the polypeptide of the first aspect by the phage.
  • Phage engineering can be performed by those skilled in the art.
  • Z-base synthesis may be achieved through the introduction of PurZ or PurZ0.
  • the method further comprises introducing into the genome of the bacteriophage a nucleic acid molecule encoding the polypeptide of the second aspect, and expressing the polypeptide of the second aspect by the bacteriophage; and/or introducing into the genome of the bacteriophage a nucleic acid molecule encoding the polypeptide of the third aspect, and expressing the polypeptide of the third aspect by the bacteriophage.
  • the method further comprises introducing a nucleic acid molecule encoding an adenosuccinate lyase (PurB) into the genome of the bacteriophage, and expressing the adenosuccinate lyase (PurB) by the bacteriophage; and/or introducing a nucleic acid molecule encoding GMP kinase (GK) into the genome of the bacteriophage, and expressing the GMP kinase (GK) by the bacteriophage.
  • PurB and GK are required for gene assembly, but can also be provided by host bacteria.
  • the present application provides a phage obtained by the method of the sixth aspect.
  • the present application provides a host cell comprising the phage of the seventh aspect.
  • the host cell is a bacterial cell.
  • the application provides use of the phage of the seventh aspect or the host cell of the eighth aspect in diaminopurine deoxyribonucleotide (dZTP) synthesis, DNA synthesis, DNA origami, DNA-based data storage, or the preparation of antibacterials, bactericides or preservatives.
  • the inventors report a Z-containing genome biosynthetic system that can facilitate the production of Z-substituted DNA, with potential subsequent use in a variety of emerging applications such as DNA origami (23) and DNA-based data archiving, which has great potential due to its high storage capacity (24).
  • Introducing Z-containing genome biosynthetic enzymes into engineered phages could expand their host range and potency. Such phages have, for example, been used successfully in the clinic for life-saving phage therapy of multidrug-resistant A. baumannii (25) or Mycobacterium abscessus (26) infections, as well as for food preservation (27) and environmental protection purposes.
  • Increasing base diversity has been a long-standing quest in the art (28-30), and the inventors' work shows that nature already provides such an approach.
  • the inventors' work may be useful for interdisciplinary research on the origin of life and astrobiology (31).
  • NGS sequencing technology
  • To identify candidate genes involved in Z-containing genome biosynthesis, the inventors annotated two versions of the cyanobacterial phage S-2L genome using the FgenesV/FgenesV0 software from the SoftBerry platform (http://www.softberry.com/).
  • the PurZ gene was identified from positions 15407-14232, the gene length was 1176bp, and it encoded 391aa protein.
  • the PurZ gene was identified as starting at position 15122 and ending at position 16202, with a gene length of 1080 bp and encoding a protein of 359 aa.
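The relationship between the gene lengths and protein lengths quoted for the two annotations can be verified with simple arithmetic: a coding sequence of N base pairs contains N/3 codons, the last of which is the stop codon. The sketch below assumes inclusive coordinates and a single terminal stop codon; the function names are hypothetical.

```python
def cds_length(start: int, end: int) -> int:
    """Length in bp of a CDS given inclusive coordinates on either strand."""
    return abs(end - start) + 1


def protein_length(cds_bp: int) -> int:
    """Number of amino acids encoded, excluding the terminal stop codon."""
    if cds_bp % 3 != 0:
        raise ValueError("CDS length must be a multiple of 3")
    return cds_bp // 3 - 1


print(cds_length(15407, 14232))  # 1176
print(protein_length(1176))      # 391
print(protein_length(1080))      # 359
```

Both stated pairs (1176 bp with 391 aa, and 1080 bp with 359 aa) are consistent with this rule.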
  • the inventors By aligning with other PurZ CDS, the inventors found that the difference between the two versions comes from the C-terminal indel, and the NGS version should be the correct version.
  • PurZ-containing contigs in metagenomes misannotated as bacterial origin may have phage origin
  • HHPred was used to discover templates for CpPurZ, SbPurZ and CpdATPase.
  • PDB 1CIB (EcPurA) (10) and 1MEZ (MmPurA) (33) were used as templates for modeling CpPurZ and SbPurZ, respectively, as these structures are in complex with all substrates/substrate mimics or products. Both models of PurZ bind three substrates: dGMP, ATP and Asp.
  • PDB:5TK7(34) was chosen as template to build the homology model of CpdATPase
  • PDB: 5TK7 is a Co-dependent oxetanocin-A triphosphate/monophosphate phosphatase or dATP/dAMP phosphatase, which has only 18% overall sequence identity with CpdATPase but shares the same metal-binding residues.
  • SSN Sequence similarity network
  • the SSN of the PurA superfamily (Pfam accession number: PF00709) was generated using the web-based tool EFI-EST (https://efi.igb.illinois.edu/efi-est/) (36). SSNs were displayed by Cytoscape 3.7.0 (37) with e-values of 10⁻¹⁶⁰, each node representing a set of sequences with >85% sequence identity. Seventy-six proteins (62 from phage, 12 from metagenome, and 2 from archaea) were putatively named PurZ. Their sequences were used to construct individual SSNs, shown with e-values of 10⁻⁷⁰. During the course of this application, an additional 56 PurZ sequences were discovered, shown in Table 1.
  • MAFFT version 7.450 (38) was used to generate multiple sequence alignments of the PurA superfamily (Pfam: PF00709), the PurZ family, selected representative PurA/PurZ sequences and CpdATPase homologs. Based on multiple sequence alignments of PurA/PurZ sequences and dATPase sequences, key residues for ligand binding were used to generate sequence logos with WebLogo 3 (39).
  • GNNs Genome Neighborhood Networks
  • LB medium was purchased from Oxoid Limited (Hampshire, UK). Ultrapure deionized water from Millipore Direct-Q was used. TALON resin was purchased from Clontech Laboratories Inc (California, USA). All protein purification chromatography experiments were performed on an ÄKTA pure FPLC system (GE Healthcare, USA). dZTP was purchased from Trilink (California, USA). Other nucleotides were purchased from Sigma-Aldrich.
  • Codon-optimized gene fragments for PurZ, dATPase, EcPurB, SeGK, and ApDUF550 were synthesized by Genewiz Inc. (Suzhou, China) and inserted between the NdeI and BamHI restriction sites in pET-28a(+) to express proteins with an N-terminal His6 tag. Using the Gibson assembly cloning protocol, CpPurZ, ApdATPase and ApDUF550 were also inserted into the SspI restriction site of the pET-28a(+)-HMT vector.
  • the resulting plasmid tandem contains: His6-tag, maltose binding protein (MBP) and TEV protease cleavage site, followed by the construct of interest, confirmed by sequencing.
  • the EcPurB (UniProt: P0AB89) gene was amplified by PCR (forward primer CCAGAGCGGATCAGGAATGGAATTATCCTCACTGACCG; reverse primer CCAATTGAGATCTGCCATATGTTATTTCAGCTCATCAACCATCG) and inserted into pET-28a(+) vector using Gibson assembly to express the protein with an N-terminal His6 tag.
  • E. coli BL21 (DE3) cells were transformed with plasmids encoding the PurZ, PurB, dATPase, SeGK and ApDUF550 genes and plated on LB agar supplemented with 50 μg/mL kanamycin. Transformants were grown in LB medium (300 mL) at 37°C in a shaking incubator at 220 rpm. When the OD600 reached about 0.8, the temperature was lowered to 18°C and isopropyl β-D-1-thiogalactopyranoside (IPTG) was added to a final concentration of 0.5 mM to induce the production of the protein of interest. After 16-20 hours, cells were harvested by centrifugation (6000 × g, 10 min at 4°C).
  • lysis buffer 50 mM Tris-HCl, pH 8.0, 1 mM phenylmethanesulfonyl fluoride, 0.2 mg/mL lysozyme, 0.03% Triton X-100 and 0.02 mg/mL DNase I.
  • the cell suspension was frozen in a -80°C freezer, then thawed, and incubated in a 25°C water bath for 30 minutes to allow cell lysis.
  • 1.5 mL of 11% streptomycin sulfate (in water) was added to the cell lysate, followed by gentle mixing and centrifugation (20000 xg, 10 min at 4°C).
  • the supernatant was filtered through a 0.22 μm filter and loaded onto a 5 mL TALON cobalt column pre-equilibrated with buffer A (20 mM HEPES (pH 7.5), 5 mM β-mercaptoethanol (BME) and 0.2 M KCl).
  • the column was washed with 10 column volumes (CV) of buffer A, followed by protein elution with 5 CV of buffer A containing 150 mM imidazole.
  • the eluate was dialyzed against 2 L of buffer B (20 mM HEPES (pH 7.5), 0.1 M KCl and 1 mM dithiothreitol) overnight to remove imidazole and then concentrated using a centrifugal concentrator (30 K MWCO; Millipore).
  • concentration of purified protein was calculated from absorbance at 280 nm using NANODROP ONE (Thermo SCIENTIFIC).
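Calculating protein concentration from absorbance at 280 nm follows the Beer-Lambert law, c = A / (ε · l). The sketch below illustrates the arithmetic only; the extinction coefficient and molecular weight used in the example are hypothetical placeholders, not values reported in this application, and in practice they would be computed from each protein's sequence.

```python
def protein_conc_mg_per_ml(a280: float,
                           ext_coeff_per_m_cm: float,
                           mw_da: float,
                           path_cm: float = 1.0) -> float:
    """Convert A280 to mg/mL using the Beer-Lambert law.

    ext_coeff_per_m_cm: molar extinction coefficient in M^-1 cm^-1.
    mw_da: molecular weight in Da (g/mol).
    """
    molar = a280 / (ext_coeff_per_m_cm * path_cm)  # mol/L
    return molar * mw_da                           # g/L, equal to mg/mL


# Hypothetical values, for illustration only.
conc = protein_conc_mg_per_ml(a280=1.0, ext_coeff_per_m_cm=50_000, mw_da=40_000)
print(round(conc, 3))  # 0.8
```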
  • Purified proteins with commercial protein markers (Genstar, Shenzhen) were detected on 4-20% SDS polyacrylamide gradient gels and visualized by Coomassie staining.
  • a 50 μL reaction mixture containing 20 mM HEPES (pH 7.5), 2 mM dGMP, 1 mM ATP, 2 mM Mg 2+ , 5 mM Asp-Na + and 5 μM Sb/Sp/Vp/ApPurZ was incubated at room temperature (RT) for 0-6 h. Absorbance from 220 to 340 nm was monitored using NANODROP ONE (Thermo SCIENTIFIC). Phosphate was quantified using a colorimetric phosphomolybdate assay (13).
  • a 300 μL reaction mixture containing 20 mM Tris-HCl (pH 8.0), 1 mM dGMP, 0.5 mM ATP, 2 mM Mg 2+ , 5 mM Asp-Na + and 5 μM SbPurZ was incubated for 1 hour at room temperature. Two negative controls were additionally prepared omitting SbPurZ or Asp-Na + .
  • the reaction was then applied to a centrifuge concentrator (3K MWCO; Millipore).
  • ESI-MS/MS analysis of the flow-through was performed using a Q Exactive™ HF/UltiMate™ 3000 RSLCnano (Thermo Fisher) instrument.
  • the sample loading volume was 5 μL, and the sample loading rate was 0.2 mL/min.
  • the SbPurZ assay was additionally repeated with the inclusion of 5 μM of EcPurB.
  • a 300 μL reaction mixture containing 10 mM Tris-HCl (pH 7.5), 1 mM dATP or 1 mM dADP, 0.5 mM Co 2+ and 0.1-2 μM Cp/ApdATPase was incubated for 1 hour at room temperature. Negative controls omitted Cp/ApdATPase.
  • the reaction mixture was then incubated in a boiling water bath for 3 minutes and the precipitated protein was removed by centrifugation (18000 × g, 5 minutes). The supernatant was filtered and loaded onto an Agilent 6230 TOF LC/MS instrument (Agilent Technologies, CA, USA) for ESI-MS analysis without liquid chromatography.
  • the sample loading volume was 5 μL, and the sample loading rate was 0.2 mL/min.
  • the reaction mixture was then processed by centrifugal concentrator (3K MWCO; Millipore) to remove enzymes and loaded onto an Agilent 6230 TOF LC/MS instrument (Agilent Technologies, CA, USA) for ESI-MS analysis without an LC column.
  • the sample loading volume was 5 μL, and the sample loading rate was 0.2 mL/min.
  • a reaction mixture containing 10 mM HEPES (pH 7.5), 0.5 mM dATP or dADP, 2 mM Mg 2+ , 2 mM Co 2+ , 0.2 μM Cp/Sp/ApdATPase and 0.3 μM inorganic pyrophosphatase 1 (IPP1) was incubated at room temperature for 1 h. Pyrophosphate was quantified using a colorimetric phosphomolybdate assay (13).
  • a reaction mixture containing 10 mM HEPES (pH 7.5), 0.5 mM dAMP, 2 mM Co 2+ and 1 μM Cp/Sp/ApdATPase was incubated for 1 hour at room temperature. Phosphate was quantified using a colorimetric phosphomolybdate assay (13, 42).
  • a 100 μL reaction mixture containing 10 mM HEPES (pH 7.5), 0-600 μM dATP/dADP/dAMP, 2 mM Mg 2+ , 2 mM Co 2+ , 0.04-0.8 μM Cp/Sp/ApdATPase and 0.3 μM IPP1 was incubated at room temperature for 0-10 minutes. Triphosphate, pyrophosphate, and phosphate were quantified using a colorimetric phosphomolybdate assay (13, 42).
  • Point mutations in the SbPurZ active site were introduced by site-directed mutagenesis and confirmed by sequencing.
  • a 25 μL PCR reaction contained 50 ng pET28a(+)-SbPurZ plasmid as template, 0.4 μM forward and reverse primers (Table 6) and Fast Alteration DNA polymerase (KM101 from TIANGEN, Beijing, China).
  • the PCR was run for 17 cycles, and the reaction mixture was then digested with DpnI to remove template and transformed into FDM competent cells (TIANGEN). SbPurZ muteins were expressed and purified as described for wild-type PurZ.
  • a 100 μL reaction mixture containing 10 mM HEPES (pH 7.5), 0.5 mM (d)NTP, 2 mM Mg 2+ , 2 mM Co 2+ , 0.2 μM ApDUF550 (no ApDUF550 added to the blank control) and 0.3 μM IPP1 was incubated at room temperature for 0.5 hour. Pyrophosphate formation was quantified using a colorimetric phosphomolybdate assay (13).
  • a 100 μL reaction mixture containing 10 mM HEPES (pH 7.5), 0.5 mM dATP, 2 mM Co 2+ (or other metal ions), 0.3 μM IPP1 (2 mM Mg 2+ ) and 0.2 μM HMT-ApDUF550 was incubated for 0.5 h at room temperature.
  • Pyrophosphate was quantified using a colorimetric phosphomolybdate assay (13).
  • the reaction system contained 20 ng pET28a(+)-ApPurZ, 0.2 mM dNTPs (with dATP) or 0.2 mM dNTPs (with dZTP in place of dATP; Trilink N-2003-1), 0.5 μM forward primer (ATGAAGAAGGCGACCGTTATTT), 0.5 μM reverse primer (TCAAGCAATGTTTGATGATTTGTTAT), 1× Q5 reaction buffer, and 1 μL of Q5 DNA polymerase (NEB M0491L). Reactions were performed on a T100 thermal cycler (BIO-RAD) with an initial denaturation at 98°C for 30 seconds followed by 30 cycles (98°C for 10 seconds, 60°C for 15 seconds and 72°C for 30 seconds).
  • PCR products were purified using the StarPrep Gel Extraction Kit (GenStar) following the manufacturer's instructions. Concentrations were then measured using NANODROP ONE (Thermo SCIENTIFIC) and purity assessed by agarose gel electrophoresis. 400ng of DNA was loaded on each lane of the agarose gel. Samples digested with restriction enzyme SspI (NEB R0132S) were also included in the gel analysis.
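The expected outcome of an SspI digest can be anticipated in silico by locating recognition sites in the template. The sketch below assumes the SspI recognition sequence AATATT with a blunt cut between AAT and ATT, treats the DNA as linear, and models Z substitution purely as a change of letter. Whether the enzyme actually cuts Z-containing DNA is an experimental question that the gel analysis addresses. The function name and sequences are hypothetical.

```python
import re

SSPI_SITE = "AATATT"  # SspI recognition sequence; blunt cut between AAT and ATT


def sspi_fragment_lengths(seq: str) -> list[int]:
    """Return fragment lengths after complete SspI digestion of a linear sequence."""
    seq = seq.upper()
    # A lookahead finds every site start, even if sites were to overlap.
    cuts = [m.start() + 3 for m in re.finditer(f"(?={SSPI_SITE})", seq)]
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# Hypothetical sequences, for illustration only.
a_dna = "GGGAATATTCCC"
z_dna = a_dna.replace("A", "Z")  # every A written as Z, as in Z-substituted DNA
print(sspi_fragment_lengths(a_dna))  # [6, 6]
print(sspi_fragment_lengths(z_dna))  # [12]
```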
  • a 300 μL reaction mixture containing 20 mM HEPES (pH 8.5), 1 mM dGMP, 1 mM ATP, 5 mM Mg 2+ , 5 mM Asp-Na + , 5 μM Sp/ApPurZ and 5 μM EcPurB was incubated at room temperature for 4 hours or 0.5 hours.
  • the enzyme was removed from the reaction mixture by a centrifugal concentrator (3K MWCO; Millipore) and the product contained in the flow-through served as a substrate for SeGK. 5 μM SeGK, 1 mM ATP and 5 mM Mg 2+ were added and incubated for an additional 45 minutes at room temperature.
  • the reaction mixture was reapplied to a centrifugal concentrator (3K MWCO; Millipore), retaining the protein.
  • the flow-through was then loaded onto an Agilent 6230 TOF LC/MS instrument (Agilent Technologies, CA, USA) for ESI-MS analysis without an LC column.
  • the sample loading volume was 5 μL, and the sample was loaded with water at a loading rate of 0.2 mL/min.
  • Acinetobacter phage SH-Ab 15497 (10⁹ PFU/mL) was mixed with Acinetobacter baumannii strain 15497 in logarithmic growth phase (OD600 0.6-0.8) at a volume ratio of 1:4, inoculated onto an LB soft agar overlay (0.7% agar) and incubated in a 37°C incubator for 7-8 hours. Lysates were then collected in SM buffer (50 mM Tris-HCl (pH 7.5), 100 mM NaCl, 8 mM MgSO 4 and 0.1 g/L gelatin) and incubated overnight at 4°C in a 100 rpm shaker. Cell debris was removed by centrifugation at 10,000 × g, 4°C for 10 minutes. The genomic DNA of phage SH-Ab 15497 was extracted using the lambda phage genomic DNA kit (Zoman Biotech, ZP317-1).
  • Genomic DNA from Acinetobacter phage SH-Ab 15497 was enzymatically digested under neutral conditions. Briefly, a 150 μL mixture containing 5 μg phage genomic DNA or positive control (4 μg ApPurZ PCR product with Z) or negative control (4 μg ApPurZ PCR product with A), 2 μL DNase I (NEB0303AA), 0.008 units of phosphodiesterase I (Sigma P3243), 2 μL of alkaline phosphatase (Takara 2120a) and 15 μL of 10× reaction buffer (500 mM Tris-HCl (pH 7.5), 100 mM NaCl, 10 mM MgCl 2 and 10 mM ZnSO 4 ) was incubated overnight at 37°C and then applied to a centrifugal concentrator (3K MWCO; Millipore) for centrifugation.
  • the flow-through was analyzed using an Agilent 6420 Triple Quadrupole LC/MS instrument (Agilent Technologies, CA, USA). LC separations were performed on a Syncronis aQ (150 mm × 4.6 mm, 3 μm) column at a flow rate of 0.5 mL/min at room temperature. 10 mM NH4Ac (pH 4.6) in water (solvent A) and methanol (solvent B) were used as the mobile phase. A gradient of 20-32% solvent B from 0-12 minutes was used. The sample loading volume was 10 μL and the UV detector was set at 260 nm.
  • Standard compounds included commercially available deoxynucleosides (dA, dT, dC, and dG) and homemade dZ (dZTP hydrolyzed with alkaline phosphatase (Takara 2120a) by incubation at 37°C for 1 hr, followed by centrifugal filtration to retain the enzyme). 2 μL of each standard compound (approximately 0.1 μg) was loaded.
  • Enzyme-digested phage genomic DNA samples (10 μL) were subjected to ESI-MS/MS analysis using a Q Exactive™ HF/UltiMate™ 3000 RSLCnano (Thermo Fisher) instrument equipped with the same LC column, using the same elution protocol at a flow rate of 0.4 mL/min. Mass detection was performed in positive electrospray ionization (ESI) mode.
  • Restriction endonuclease digestion was performed on 0.2-0.3 μg of phage genomic DNA or Z-containing ApPurZ PCR product in a 20 μL reaction mixture according to the manufacturer's instructions.
  • a library of native DNA extracted from Acinetobacter phage SH-Ab 15497 and a Z/A-containing ApPurZ PCR product library was constructed with SQK-LSK109 (version 2019). The library was then loaded into a flow cell FLO-MIN106 (R9.4, Oxford Nanopore Technologies) with over 1100 active single wells. MinION Mk1B sequencer was used for sequencing. Guppy 3.2.10 integrated in MinKNOW 3.6.5 was used for base calling.
  • the raw signal for each read was extracted from the fast5 file via ont_fast5_api and long (longer than 100) low variance regions were removed.
  • the corresponding NGS version of the phage DNA sequence used for each read was extracted from the BLASTn results above.
  • the original signal was then aligned to the phage genomic DNA sequence using cwDTW (21). Based on the alignment, a cwDTW score (absolute difference between normalized original signal and normalized expected signal) was generated for each hexamer in the reads (21). Based on the cwDTW scores of the hexamers, the median normalized signal for each hexamer across all reads was extracted to generate a customized mapping table.
  • the custom mapping table is then used by cwDTW to regenerate a new alignment file.
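The construction of the customized mapping table described above (taking the median normalized signal for each hexamer across all reads) can be sketched as follows. This is a simplified illustration and not the cwDTW implementation: it assumes the signal-to-sequence alignment has already been done and is supplied as (hexamer, normalized signal) pairs, and all function names and values are hypothetical.

```python
from collections import defaultdict
from statistics import median


def build_mapping_table(aligned, k=6):
    """Pool normalized signals per k-mer across reads and take the median.

    aligned: iterable of (kmer, normalized_signal) pairs from all reads.
    """
    pooled = defaultdict(list)
    for kmer, signal in aligned:
        if len(kmer) != k:
            raise ValueError(f"expected a {k}-mer, got {kmer!r}")
        pooled[kmer].append(signal)
    return {kmer: median(values) for kmer, values in pooled.items()}


def signal_score(observed: float, expected: float) -> float:
    """Absolute difference between normalized observed and expected signal."""
    return abs(observed - expected)


# Hypothetical aligned events, for illustration only.
events = [("ACGTAC", 0.1), ("ACGTAC", 0.3), ("ACGTAC", 0.8), ("TTTTTT", -1.0)]
table = build_mapping_table(events)
print(table["ACGTAC"])                     # 0.3
print(signal_score(0.5, table["ACGTAC"]))  # 0.2
```

The median is used rather than the mean so that a few badly aligned events do not distort the expected level for a hexamer.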
  • the cwDTW score distribution of the phage reads was similar to that of the host reads, indicating the validity of the final mapping table.
  • Kirnos et al. in 1977 reported the complete replacement of A by Z in the genome of the cyanobacterial phage S-2L (1). When Z base pairs with T, three hydrogen bonds are formed (1, 2). Z differs from other types of nucleobase modifications (3) in its unique role in altering Watson-Crick base pairing, which changes the physical, chemical, and mechanical properties of double-stranded DNA (2, 4-6). However, there has not been any report on the biochemical characterization of the enzymes required for the synthesis of Z-containing genomes. Understanding Z-containing genome biosynthesis will help to study the distribution of species with Z-containing genomes.
  • PurA adenylosuccinate synthetase
  • PurA and adenylosuccinate lyase are known to catalyze the conversion of inosine 5'-monophosphate (IMP) to adenosine 5'-monophosphate (AMP) in many organisms (panel A in Figure 1) (9).
  • IMP inosine 5'-monophosphate
  • AMP adenosine 5'-monophosphate
  • the present application proposes and demonstrates that PurZ participates in a similar reaction to provide Z nucleotides (panel A in Figure 1).
  • PurA's substrates are IMP, guanosine 5'-triphosphate (GTP) and Asp. PurA catalyzes the transfer of the γ-phosphate of GTP to IMP, followed by displacement of the phosphate by aspartate to form adenylosuccinate (9).
  • the catalytic residue Asp13 (D) in the GDxxKG motif of PurA is replaced by a Ser (S) residue in PurZ (Figure 1, panels D and E, and Figure 6).
  • Molecular modeling suggests that replacing Asp with the smaller Ser accommodates the additional 2-amino group of the substrate (panel D in Fig. 1) and may also alter the catalytic mechanism (Fig. 7), since in PurA this Asp abstracts a proton from IMP (9-11).
  • the inventors screened a variety of PurZ homologs, selected four (Ap/Sp/Vp/SbPurZ, SEQ ID NOs: 1-4) (Figure 9, Figure 10 and Table 1), and assayed their enzymatic activities using 2'-deoxyguanosine 5'-monophosphate (dGMP), adenosine 5'-triphosphate (ATP) and Asp as substrates (panel A in Figure 1). The results were consistent with the inventors' homology model (panels D and F in Figure 1). Ap, Sp and VpPurZ are from phage isolates, and SbPurZ, found in a metagenomic contig, should also be of phage origin (Table 2).
  • the kcat ranged from 2.3 to 16.5 min⁻¹ for the four enzymes, and the KM values were 1.6-5.1 μM for dGMP and 4.1-21.1 μM for ATP (Figure 11 and Table 3).
  • the apparent KM for dGMP is much lower than the previously reported intracellular concentration of dGMP in bacteria (about 50 μM) (14).
  • the kcat/KM for dIMP was significantly (15-fold) lower than that for the physiological substrate dGMP.
  • reaction intermediates and products of PurZ were confirmed by electrospray ionization tandem mass spectrometry (ESI-MS/MS, B-D in Figure 2 and I-L in Figure 10).
  • ESI-MS/MS electrospray ionization tandem mass spectrometry
  • ADAS 2-aminodeoxyadenosuccinate
  • the inventors mutated substrate-interacting residues in SbPurZ to the corresponding residues in EcPurA. All mutations impaired PurZ activity to varying degrees (Table 4). Notably, the S15D mutation completely abolished the activity of SbPurZ, consistent with a role for S15 in accommodating the 2-amino group of dGMP (D in Figure 1).
  • the T274G mutant showed a 271-fold increase in KM for Asp, consistent with a role for T274 in Asp binding (Figure 7).
  • the inventors examined the genomic environment of PurZ and identified a DNA polymerase, an HD domain-containing hydrolase-like enzyme and a DUF550 domain-containing protein (A in Figure 3).
  • HD domain enzymes also referred to herein as dATPases
  • dATPases are present in the genomes of 20 PurZ-containing phages (Fig. 3, A). These HD domain enzymes have highly conserved metal and ligand binding pockets, but their sequences are highly diverse ( Figure 3A and Figure 13).
  • the inventors prepared recombinant Cp/Ap/Sp HD enzymes (SEQ ID NOs: 5-7) (FH in Figure 9). Although they share only 24-34% sequence identity (Fig.
  • the enzyme is highly specific for dATP with much lower hydrolytic activity for NTP and other dNTPs (C in Figure 3).
  • the enzyme also catalyzes the hydrolysis of 2'-deoxyadenine 5'-diphosphate (dADP) and 2'-deoxyadenine 5'-monophosphate (dAMP) to dA, releasing pyrophosphate and phosphate, respectively (Figures 15, 16).
  • dATPases can promote the synthesis of Z-containing genomes by specifically removing dATP and its precursor dADP from the host's nucleotide pool (16), thereby preventing the incorporation of A into the phage genome.
  • a DUF550 domain-containing protein is also of interest because it coexists with PurZ.
  • Recombinant ApDUF550 (SEQ ID NO: 8) displays dATP and 2'-deoxyguanosine 5'-triphosphate (dGTP) pyrophosphohydrolase activity, catalyzing the hydrolysis of dATP/dGTP to pyrophosphate and dAMP/dGMP, respectively. The highest activity was obtained with Co²⁺ as the divalent metal cofactor, and there was little or no activity against NTPs and pyrimidine dNTPs (D in Figure 3 and Figure 17).
  • this DUF550-containing enzyme may act to provide dGMP as a PurZ substrate, increasing dZTP levels, while depleting dATP to further facilitate Z incorporation (Fig. 3B).
  • the identification of PurZ and other genes involved in dZTP biosynthesis and genome incorporation provides a basis for studying the occurrence of Z-containing genomes in nature.
  • Predicted PurZ sequences included 60 sequences from phage isolates and 13 sequences from phage contigs in the metagenome ( Figure 1B and Table 1).
  • PurZ-containing phages mainly belong to the Podoviridae and Siphoviridae families (B in Figure 1). In the post-genomic era, where chemical determination of nucleotide content or base composition is no longer routine, the possibility that these phages contain modified purines in their DNA may be overlooked.
  • Phage DNA was prepared and digested with a combination of DNase I, phosphodiesterase I, and alkaline phosphatase, followed by LC-UV spectrometry and LC-MS/MS analysis ( Figure 4, A and B).
  • phage DNA extracts contain varying amounts of host DNA fragments
  • the inventors performed nanopore sequencing of crude extracts and showed that the quality of phage reads (quantified by Q-score, read identity and cwDTW alignment score) was similar to that of the Z-DNA control, while the quality of the host reads was similar to that of normal DNA (C-E in Figure 4 and Figures 20, 21) (18-20).
  • the nanopore signal was assigned approximately 100M reads each for phage and host DNA, which allowed the inventors to perform a statistical analysis of the data and conclude that phage DNA is virtually free of adenine, while Z is unlikely to be incorporated into host DNA .
  • GpPurZ0 (species: Gordonia phage Archimedes, UniProt ID: A0A7L7SI10, SEQ ID NO: 121) was used as an example to characterize and test PurZ0.
  • the protein obtained from the TALON column was dialyzed against 2 L of buffer [20 mM Tris·HCl, pH 7.5, 5 mM BME] for 3 hours and then loaded onto a 10 mL Q Sepharose anion exchange column, which was eluted with a linear gradient of buffer containing 300 to 700 mM KCl. Fractions containing the main GpPurZ0 peak were collected and concentrated to a final volume of 5 mL using a centrifugal concentrator (30K MWCO; Millipore).
  • the protein solution was then injected onto a Superdex 200 size-exclusion column (300 mL) pre-equilibrated with buffer [20 mM Tris·HCl, pH 7.5, 0.2 M KCl, 1 mM dithiothreitol] and eluted with buffer B.
  • the protein eluted from the gel filtration column was re-concentrated and buffer-exchanged into storage buffer [10 mM HEPES/KOH, pH 7.4, 50 mM KCl, 1 mM Tris-(2-carboxyethyl)-phosphine hydrochloride]. The final protein concentration was adjusted to 10 mg/mL.
  • Preliminary screening of GpPurZ0 crystals was performed by the sitting drop method in a 96-well plate using a crystallization robotic system (Gryphon, Art Robbins).
  • the optimal conditions for producing crystals were 1.26 M ammonium sulfate, 0.2 M lithium sulfate, 0.1 M Tris pH 8.5 and 5 mM GTP.
  • SSRF 18U Shanghai Synchrotron Radiation Light Source
  • Data were processed using HKL3000 software.
  • Molecular replacement was performed on PHENIX software using the crystal structure model created with the website PHYRE2.
  • the structure was built manually in Coot according to the electron density and further refined in PHENIX, and was then deposited in the RCSB Protein Data Bank (accession code 7VF6).
  • the appendix table contains data collection and final optimized crystal structure data (Table 7). All structural maps were generated using UCSF Chimera ( Figure 25).
  • PurZ and PurZ0 share the same dGMP active-site motif, GSxxKG (where x represents any amino acid residue; usually located at positions 13-18 of the amino acid sequence), but differ in their ATP/dATP and GTP/dGTP active sites. Specifically, the ATP/dATP active site of PurZ is NxxN/Q (where x represents any amino acid residue; usually located around position 300 of the amino acid sequence), whereas the GTP/dGTP active site of PurZ0 is T/SxxD (where x represents any amino acid residue; usually located around position 300 of the amino acid sequence). PurZ and PurZ0 can be distinguished according to these structural properties.
  • the gene sequences of ApPurZ and dATPase were cloned into plasmid pRS426 (uracil-deficient) by homologous recombination, and the gene sequence of DUF550 (see Example 1) was likewise cloned into plasmid pRS425 (leucine-deficient) (Figure 26). After the genes were confirmed by sequencing, the two plasmids were transformed into yeast, and the transformants were plated on solid synthetic yeast medium lacking uracil and leucine.
  • Cyanophage S-2L contains DNA with 2,6-diaminopurine substituted for adenine. Virology 88, 8-18 (1978).
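
The nanopore processing steps listed above build a customized mapping table from the median normalized signal of each hexamer across all reads. The following Python sketch illustrates only that aggregation step. It assumes the signal has already been aligned to the reference (for example by cwDTW) and that each read is available as a sequence plus one normalized signal level per hexamer start position; this input layout and the function name `build_hexamer_table` are hypothetical and are not taken from the application or from cwDTW itself.

```python
from collections import defaultdict
from statistics import median


def build_hexamer_table(aligned_reads, k=6):
    """Return {k-mer: median normalized signal} pooled across all reads.

    aligned_reads is an iterable of (sequence, signals) pairs, where
    signals[i] is the normalized signal level aligned to the k-mer that
    starts at index i of sequence (an assumed, illustrative layout).
    """
    levels = defaultdict(list)
    for sequence, signals in aligned_reads:
        n_kmers = len(sequence) - k + 1
        # Only positions that have both a full k-mer and a signal are used.
        for i in range(min(n_kmers, len(signals))):
            levels[sequence[i:i + k]].append(signals[i])
    return {kmer: median(values) for kmer, values in levels.items()}


# Toy input: two short reads with made-up normalized signal levels.
reads = [
    ("ACGTACGT", [0.10, 0.50, -0.20]),
    ("ACGTACGA", [0.30, 0.70, -0.40]),
]
table = build_hexamer_table(reads)
```

In this toy input the hexamers ACGTAC and CGTACG occur in both reads, so their table entries are medians over two values, while GTACGT and GTACGA each occur once. A table of this kind could then be supplied as the expected-signal model when the alignment is regenerated.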

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Organic Chemistry (AREA)
  • Engineering & Computer Science (AREA)
  • Genetics & Genomics (AREA)
  • Wood Science & Technology (AREA)
  • Zoology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Microbiology (AREA)
  • General Engineering & Computer Science (AREA)
  • Biochemistry (AREA)
  • Molecular Biology (AREA)
  • Biotechnology (AREA)
  • Medicinal Chemistry (AREA)
  • Biomedical Technology (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • General Chemical & Material Sciences (AREA)
  • Veterinary Medicine (AREA)
  • Public Health (AREA)
  • Animal Behavior & Ethology (AREA)
  • Pharmacology & Pharmacy (AREA)
  • Virology (AREA)
  • Epidemiology (AREA)
  • Mycology (AREA)
  • Biophysics (AREA)
  • Nutrition Science (AREA)
  • Immunology (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Communicable Diseases (AREA)
  • Food Science & Technology (AREA)
  • Polymers & Plastics (AREA)
  • Physics & Mathematics (AREA)
  • Oncology (AREA)
  • Plant Pathology (AREA)
  • Gastroenterology & Hepatology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Enzymes And Modification Thereof (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)

Abstract

Provided are an enzyme involved in phage diaminopurine synthesis, and a use of the enzyme. The enzyme comprises 2-amino deoxyadenosine succinate (ADAS) synthetase, 2'-deoxyadenine-5'-triphosphate triphosphohydrolase (dATPase), and dATP and dGTP pyrophosphohydrolase.

Description

Enzymes involved in phage diaminopurine synthesis and their applications
Related Applications
This application claims priority to Chinese Patent Application No. 202110045505.X, filed on January 14, 2021, the entire contents of which are incorporated herein by reference for all purposes.
Field of the Invention
The present application relates generally to the fields of biochemistry and molecular biology. In particular, it discloses the Z base synthesis pathway of bacteriophages whose genomes contain diaminopurine (also known as 2,6-diaminopurine, 2-aminoadenine, or Z base), the enzymes involved, and the applications of these enzymes.
Background of the Invention
Kirnos et al. reported in 1977 that adenine (A) is completely replaced by Z in the genome of cyanophage (abbreviated herein as Cp) S-2L (1), but the pathway by which such a genome is synthesized has not yet been revealed. Because the Z base, as a new type of base, has important potential value and uses, research on its synthesis pathway is significant for both scientific research and industrial application.
Summary of the Invention
In a first aspect, the present application provides a polypeptide having 2-aminodeoxyadenosuccinate (ADAS) synthetase activity, which is capable of catalyzing the formation of 2-aminodeoxyadenosuccinate using dGMP and Asp together with one or more of ATP, dATP, GTP and dGTP as substrates, and whose catalytic motif is changed to GSxxKG as compared with the GDxxKG catalytic motif of adenylosuccinate synthetase (PurA), where x represents any amino acid residue.
Among the substrates of the various 2-aminodeoxyadenosuccinate (ADAS) synthetases identified in this application, dGMP and Asp are common substrates, whereas ATP, dATP, GTP and dGTP are variable substrates. Herein, an ADAS synthetase that uses ATP/dATP as a substrate is referred to as PurZ (PurZ was identified first), and an ADAS synthetase that uses GTP/dGTP as a substrate is referred to as PurZ0 (PurZ0 was distinguished as a subclass in studies that followed the identification of PurZ).
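The motif-based distinction between PurA, PurZ and PurZ0 can be illustrated with a short Python sketch. It uses the motifs stated in this document (GDxxKG for PurA, GSxxKG for the dGMP site of PurZ and PurZ0, NxxN/Q for the ATP/dATP site of PurZ, and T/SxxD for the GTP/dGTP site of PurZ0). Because the four-residue motifs are short enough to occur by chance, the sketch assumes that the 1-based start positions of both motifs are supplied by the caller, for example from a sequence alignment; the function names and the synthetic test sequences are illustrative only and are not sequences from the application.

```python
import re


def classify_adas_synthetase(seq, nmp_start, ntp_start):
    """Classify a sequence from two motif positions (1-based, assumed known)."""
    nmp = seq[nmp_start - 1:nmp_start + 5]  # six residues, e.g. GSxxKG
    ntp = seq[ntp_start - 1:ntp_start + 3]  # four residues, e.g. NxxN
    if re.fullmatch(r"GD..KG", nmp):
        return "PurA-like"
    if not re.fullmatch(r"GS..KG", nmp):
        return "unclassified"
    if re.fullmatch(r"N..[NQ]", ntp):
        return "PurZ"
    if re.fullmatch(r"[TS]..D", ntp):
        return "PurZ0"
    return "unclassified"


def toy_sequence(nmp_motif, ntp_motif):
    """Build a synthetic sequence with the motifs at positions 13 and 300."""
    return "A" * 12 + nmp_motif + "A" * 281 + ntp_motif + "A" * 10


print(classify_adas_synthetase(toy_sequence("GSAAKG", "NFAN"), 13, 300))
print(classify_adas_synthetase(toy_sequence("GSAAKG", "TAAD"), 13, 300))
```

A real analysis would take the positions from an alignment against a characterized reference rather than from fixed coordinates, since the motif locations are described only as typical (positions 13-18 and around position 300).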
In some embodiments, when the polypeptide is aligned with the amino acid sequence of adenylosuccinate synthetase from Escherichia coli (SEQ ID NO: 72), the residue at the position corresponding to position 303 of SEQ ID NO: 72 is changed from R to L.
In some embodiments, when the polypeptide is aligned with the amino acid sequence shown in SEQ ID NO: 2, it has T at the position corresponding to position 274, N at the position corresponding to position 306, F at the position corresponding to position 307 and N at the position corresponding to position 309 of SEQ ID NO: 2.
In some embodiments, the polypeptide comprises the sequence shown in any one of SEQ ID NOs: 1-4, 9-69, 71 and 92-146; or a variant of the sequence shown in any one of SEQ ID NOs: 1-4, 9-69, 71 and 92-146 that has insertions, deletions and/or substitutions of one or more amino acids and retains 2-aminodeoxyadenosuccinate (ADAS) synthetase activity; or a fragment of the sequence shown in any one of SEQ ID NOs: 1-4, 9-69, 71 and 92-146 that retains the catalytic motif.
In a second aspect, the present application provides a polypeptide having 2'-deoxyadenine 5'-triphosphate triphosphohydrolase (dATPase) activity, which is capable of catalyzing the hydrolysis of dATP to 2'-deoxyadenine (dA), the polypeptide comprising metal and ligand binding pockets.
In some embodiments, the polypeptide is also capable of catalyzing the hydrolysis of dADP and dAMP to 2'-deoxyadenine (dA).
In some embodiments, the polypeptide comprises Co²⁺ as a divalent metal cofactor.
In some embodiments, the polypeptide comprises the sequence shown in any one of SEQ ID NOs: 5-7 and 73-91; or a variant of the sequence shown in any one of SEQ ID NOs: 5-7 and 73-91 that has insertions, deletions and/or substitutions of one or more amino acids and retains dATPase activity; or a fragment of the sequence shown in any one of SEQ ID NOs: 5-7 and 73-91 that retains the catalytic motif.
In a third aspect, the present application provides a polypeptide having dATP and dGTP pyrophosphohydrolase activity, which is capable of catalyzing the hydrolysis of dATP to dAMP and of dGTP to dGMP, the polypeptide comprising metal and ligand binding pockets.
In some embodiments, the polypeptide comprises Co²⁺ as a divalent metal cofactor.
In some embodiments, the polypeptide comprises the sequence shown in SEQ ID NO: 8; or a variant of the sequence shown in SEQ ID NO: 8 that has insertions, deletions and/or substitutions of one or more amino acids and retains dATP and dGTP pyrophosphohydrolase activity; or a fragment of the sequence shown in SEQ ID NO: 8 that retains the catalytic motif.
The polypeptides of the first to third aspects can all participate in the Z base synthesis pathway of phages, the polypeptide of the first aspect being the most critical.
In a fourth aspect, the present application provides nucleic acid molecules encoding the polypeptides of the first to third aspects.
In a fifth aspect, the present application provides a vector comprising the nucleic acid molecule of the fourth aspect.
In a sixth aspect, the present application provides a method for engineering a phage, comprising introducing a nucleic acid molecule encoding the polypeptide of the first aspect into the genome of the phage, so that the phage expresses the polypeptide of the first aspect.
In some embodiments, the method further comprises introducing a nucleic acid molecule encoding the polypeptide of the second aspect into the genome of the phage, so that the phage expresses the polypeptide of the second aspect; and/or introducing a nucleic acid molecule encoding the polypeptide of the third aspect into the genome of the phage, so that the phage expresses the polypeptide of the third aspect.
In some embodiments, the method further comprises introducing a nucleic acid molecule encoding adenylosuccinate lyase (PurB) into the genome of the phage, so that the phage expresses adenylosuccinate lyase (PurB); and/or introducing a nucleic acid molecule encoding GMP kinase (GK) into the genome of the phage, so that the phage expresses GMP kinase (GK).
In a seventh aspect, the present application provides a phage obtained by the method of the sixth aspect.
In an eighth aspect, the present application provides a host cell comprising the phage of the seventh aspect.
In some embodiments, the host cell is a bacterial cell.
In a ninth aspect, the present application provides use of the phage of the seventh aspect or the host cell of the eighth aspect in diaminopurine deoxyribonucleotide (dZTP) synthesis, DNA synthesis, DNA origami, DNA-based data storage, preparation of antibacterial drugs, preparation of bactericides, or preparation of preservatives.
Brief Description of the Drawings
Figure 1. Sequence and structural analysis of PurZ. A: Analogous biosynthetic pathways of AMP and dZMP; B: Sequence similarity network of putative PurZ; C: Homology model of SbPurZ; D: NMP binding sites in SbPurZ and EcPurA; E: Sequence logo of the NMP binding site; F: Purine-interacting residues in the NTP binding sites of PurZ/EcPurA; G: Sequence logo of the NTP binding site; residue numbering is from EcPurA and Cp/SbPurZ.
Figure 2. Substrate specificity of SbPurZ, ApdATPase and ApDUF550. A: Substrate specificity of SbPurZ; B: Complete reaction of SbPurZ, dGMP, ATP and Asp; C: Reaction of Asp, dGMP and ATP, omitting SbPurZ; D: Two-step reaction combining SbPurZ and EcPurB.
Figure 3. Putative biosynthetic pathway of Z-containing phage genomes. A: Genomic neighborhood of PurZ in phages; B: Biosynthetic pathway of Z-containing genomes; C: Substrate specificity of ApdATPase; D: ApDUF550.
Figure 4. Verification of Z base incorporation in the genome of Acinetobacter phage SH-Ab 15497. LC-UV detection (A) and extracted ion chromatogram (B) of dZ in the phage DNA hydrolysate; read Q-scores (C) and read identities (D) of host, phage, and A- and Z-containing ApPurZ PCR products relative to the corresponding NGS sequences; cwDTW score distributions of host and phage reads (E); restriction enzyme digestion of phage genomic DNA (F); SspI digests the A-containing ApPurZ PCR product but not the Z-containing ApPurZ PCR product (G).
Figure 5. Protein sequences of exemplary CpPurZ (PurA homolog) and CpdATPase used in the present application. Differences between the two CpPurZ sequences are underlined.
Figure 6. Multiple sequence alignment of five different PurZ enzymes and EcPurA. dGMP-, ATP- and Asp-binding residues are each highlighted with a gray background. The positions of the mutations constructed in SbPurZ in this application are boxed.
Figure 7. Proposed catalytic mechanism of SbPurZ. Atoms derived from dGMP, Asp and ATP are highlighted in different shades of gray.
Figure 8. Sequence similarity network of the PurA superfamily. The network is displayed at an E-value of 10^-160, with each node representing sequences with ≥85% sequence identity.
Figure 9. SDS-PAGE analysis of the recombinant proteins purified in this application. Panels A-J show 4-20% SDS gels; lane 1 is the molecular weight marker, and lanes 2, 3 and 4 contain 1, 2 and 4 μg of purified protein. ApdATPase is a fusion protein containing an N-terminal MBP.
Figure 10. PurZ enzyme activity assays. A: Time-dependent UV-Vis spectra of the SbPurZ assay; B-D: UV-Vis spectra of the ApPurZ, SpPurZ and VpPurZ assays; E: UV-Vis difference spectra of the SbPurZ assay, obtained by subtracting the UV-Vis spectrum at time 0 from each spectrum collected at different time points; F: Time-dependent phosphate release catalyzed by SbPurZ, detected using a phosphomolybdate assay; G: pH dependence of SbPurZ activity; H: Determination of the extinction coefficient (ε) of ADAS at 287 nm; I: ESI-MS detection of 6-phosphoryl-dGMP formation by SbPurZ in the absence of Asp (i.e., negative control); J: Quantification of the phosphate released in the complete SbPurZ reaction mixture and in the reaction mixture omitting Asp; K: Mass spectrometric analysis of pure dGMP, ATP and ADP; L: MS2 data of ADAS, dZMP and dGMP from B and D of Figure 2.
Figure 11. Enzyme kinetic analysis of PurZ. A: SbPurZ; B: SpPurZ; C: VpPurZ; D: ApPurZ; E: SbPurZ. In A-D, dGMP, ATP and Asp were used as substrates; in E, dIMP, ATP and Asp were used as substrates.
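The kinetic values reported in this document (KM of 1.6-5.1 μM for dGMP, against a reported intracellular dGMP concentration of about 50 μM) imply that PurZ would operate close to saturation with respect to dGMP. The following Python sketch makes that arithmetic explicit. It assumes simple Michaelis-Menten behavior for dGMP with the other substrates saturating, which is a simplification introduced here for illustration and not a model stated in the application; the function names are likewise illustrative.

```python
def fractional_saturation(substrate_um, km_um):
    """Return v / Vmax for Michaelis-Menten kinetics: [S] / (KM + [S])."""
    return substrate_um / (km_um + substrate_um)


def turnover_per_min(kcat_per_min, substrate_um, km_um):
    """Return the per-enzyme rate (min^-1) at a given substrate concentration."""
    return kcat_per_min * fractional_saturation(substrate_um, km_um)


DGMP_UM = 50.0  # approximate intracellular dGMP concentration cited in the text

for km in (1.6, 5.1):  # lower and upper reported KM values for dGMP (uM)
    print(f"KM = {km} uM -> v/Vmax = {fractional_saturation(DGMP_UM, km):.2f}")
# KM = 1.6 uM -> v/Vmax = 0.97
# KM = 5.1 uM -> v/Vmax = 0.91
```

Under these assumptions the enzyme would run at roughly 91-97% of its maximal rate with respect to dGMP, which is consistent with the statement that the apparent KM is well below the intracellular dGMP concentration.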
Figure 12. ESI-MS analysis of the Sp/ApPurZ-EcPurB-SeGK reactions. A: Complete reaction of SpPurZ, dGMP, ATP, Asp, EcPurB and SeGK, showing formation of the products dZDP and dZTP; B: Complete reaction of ApPurZ, dGMP, ATP, Asp, EcPurB and SeGK, showing formation of the products dZDP and dZTP.
Figure 13. Sequence and structural analysis of dATPase. A: Metal and dATP binding motifs in dATPase homologs; B: Homology model of CpdATPase docked with dATP. Base- and metal-binding residues are marked in different shades of gray.
Figure 14. ESI-MS detection of dA formation by CpdATPase and ApdATPase. A: ESI(+) m/z spectrum of the CpdATPase reaction with dATP; B: ESI(+) m/z spectrum of the CpdATPase reaction with dADP; C: ESI(+) m/z spectrum of the ApdATPase reaction with dATP; D: ESI(+) m/z spectrum of the ApdATPase reaction with dADP.
Figure 15. Enzyme activity and kinetics of dATPase. A: Standard curves for phosphate, pyrophosphate and triphosphate; B: Metal-dependent colorimetric phosphate assay results for Cp/SpdATPase; C: Substrate specificity of Cp/SpdATPase; D-F: Kinetic constants of Cp/Sp/ApdATPase with dATP, dADP or dAMP as substrate.
Figure 16. Colorimetric phosphate assays of Cp/Sp/ApdATPase. A-C: Reactions of Cp/Sp/ApdATPase with dATP, dADP and dAMP, respectively.
Figure 17. Colorimetric and mass spectrometric characterization of ApDUF550-catalyzed reactions. A: 14-20% SDS gel of purified ApDUF550, where lane 1 is the molecular weight marker and lanes 2-4 contain 1, 2 and 4 μg of purified ApDUF550; B: Metal-dependent colorimetric phosphate assay results for ApDUF550; C and D: ESI(-) m/z spectra of the ApDUF550-catalyzed reactions with dATP and dGTP, forming dAMP and dGMP, respectively; ApDUF550 was omitted in the negative controls.
Figure 18. ApPurZ PCR products containing A and Z. A: UV-Vis spectra of the A- and Z-containing ApPurZ PCR products; B: Agarose gel analysis of the A- and Z-containing ApPurZ PCR products.
Figure 19. LC-MS/MS detection of dZ and Z in Acinetobacter phage SH-Ab 15497 genomic DNA.
Figure 20. Changes in the nanopore signal upon Z substitution. A: An A-containing ApPurZ read matches the expected signal; B: The corresponding Z-containing ApPurZ read shows a clear signal change compared with the expected signal; C: When Z-containing ApPurZ reads are used as input data and A-containing ApPurZ reads as reference data, the A-to-Z modification can be detected by Tombo (43). The KS test uses the maximum vertical difference between two cumulative distribution curves. The D-statistic describes the difference between the two data sets; a larger D-statistic indicates a higher probability of a modified base.
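The D-statistic mentioned in the Figure 20 description, the maximum vertical difference between two cumulative distribution curves, can be computed directly from two samples. The Python sketch below is a generic two-sample Kolmogorov-Smirnov D calculation on empirical distributions; it is not Tombo's implementation, the function name is illustrative, and the signal values used are made up.

```python
from bisect import bisect_right


def ks_d_statistic(sample_a, sample_b):
    """Two-sample KS D: largest vertical gap between the empirical CDFs.

    Both samples must be non-empty. The empirical CDFs are step functions
    that change only at observed values, so checking those values suffices.
    """
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)  # fraction of a that is <= x
        cdf_b = bisect_right(b, x) / len(b)  # fraction of b that is <= x
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Hypothetical normalized signal levels at one reference position.
reference_levels = [1.0, 2.0, 3.0, 4.0]  # e.g. A-containing reads
sample_levels = [3.0, 4.0, 5.0, 6.0]     # e.g. Z-containing reads
print(ks_d_statistic(reference_levels, sample_levels))  # prints 0.5
```

Identical samples give D = 0 and completely separated samples give D = 1, so positions where the two read sets produce different signal distributions stand out with larger D values.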
Figure 21. Distribution of cwDTW alignment scores for A-containing and Z-containing ApPurZ.
Figure 22. Restriction endonuclease digestion of phage genomic DNA and the Z-containing PCR product. A: Restriction map of Sau3AI and TaqI sites in the Acinetobacter phage SH-Ab 15497 genome; B and C: Restriction enzyme digestion of phage genomic DNA and of the Z-containing ApPurZ PCR product, respectively.
Figure 23. Substrate specificity of GpPurZ0. A: Substrate specificity of GpPurZ0; B: Absorbance at 220-340 nm of the complete GpPurZ0 reaction, measured every 20 minutes; C: Colorimetric phosphate assay of GpPurZ0 with GTP or dGTP as substrate.
Figure 24. HPLC-MS detection of GpPurZ0 and EcPurB activity. A: Reaction equations of GpPurZ0 and PurB; B: Reaction of Asp, dGMP and GTP, omitting GpPurZ0; C: Reaction of GpPurZ0, dGMP and GTP, omitting Asp; D: Complete reaction of GpPurZ0, dGMP, GTP and Asp; E: Two-step reaction combining GpPurZ0 and EcPurB.
Figure 25. Crystal structure of GpPurZ0. A: Overall structure of GpPurZ0; B: Interacting amino acids in the active center of GpPurZ0 and the electron density of the substrate GTP (hydrolyzed to GDP); C: Comparison of the overall structures of GpPurZ0 and VpPurZ (Vibrio phage phiVC8); D: Comparison of the active centers in the crystal structures of GpPurZ0 and VpPurZ.
Figure 26. Construction of expression plasmids for a Z-genome in eukaryotic yeast. Left: Map of plasmid pRS426 with the gene sequences of the two proteins ApPurZ and dATPase cloned by recombination (pRS426-ApPurZ-dATPase); Right: Map of plasmid pRS425 with the gene sequence of DUF550 cloned by recombination (pRS426-DUF550).
Figure 27. Detection of dZ in the genome of yeast induced to express the two plasmids pRS426-ApPurZ-dATPase and pRS426-DUF550. A: LC-UV detection of dZ in the yeast genome; B: Extracted ion chromatogram of dZ.
Brief Description of the Sequences
SEQ ID NO: 1 shows the amino acid sequence of the 2-aminodeoxyadenosuccinate (ADAS) synthetase (abbreviated herein as PurZ) of Acinetobacter phage (abbreviated herein as Ap) SH-Ab 15497, identified and tested in this application.
SEQ ID NO: 2 shows the amino acid sequence of PurZ from a phage contig of a Sinobacteraceae bacterium (abbreviated herein as Sb), identified and tested in this application.
SEQ ID NO: 3 shows the amino acid sequence of PurZ of Salmonella phage (abbreviated herein as Sp), identified and tested in this application.
SEQ ID NO: 4 shows the amino acid sequence of PurZ of Vibrio phage (abbreviated herein as Vp), identified and tested in this application.
SEQ ID NO: 5 shows the amino acid sequence of the 2'-deoxyadenine 5'-triphosphate triphosphohydrolase (abbreviated herein as dATPase) of Acinetobacter phage (Ap) SH-Ab 15497, identified and tested in this application.
SEQ ID NO: 6 shows the amino acid sequence of the dATPase of Salmonella phage (Sp), identified and tested in this application.
SEQ ID NO: 7 shows the amino acid sequence of the dATPase of cyanophage (Cp), identified and tested in this application.
SEQ ID NO: 8 shows the amino acid sequence of the dATP and dGTP pyrophosphohydrolase (abbreviated herein as DUF550) of Acinetobacter phage (Ap) SH-Ab 15497, identified and tested in this application.
SEQ ID NOs: 9-69 show the amino acid sequences of PurZ or PurZ0 from different phages identified in this application; their database entry information is as follows:
Figure PCTCN2022071726-appb-000001
Figure PCTCN2022071726-appb-000002
Figure PCTCN2022071726-appb-000003
Figure PCTCN2022071726-appb-000004
SEQ ID NO: 70 shows the amino acid sequence corresponding to PurZ in the cyanophage (Cp) genome sequence reported in 2003, GenBank_AX955019.1.
SEQ ID NO: 71 shows the amino acid sequence corresponding to PurZ in the cyanophage (Cp) genome sequence reported in 2018, SRA_SRR8295598.
SEQ ID NO: 72 shows the amino acid sequence of the adenylosuccinate synthetase (abbreviated herein as PurA) of Escherichia coli (abbreviated herein as Ec), used in this application for analysis, comparison and testing.
SEQ ID NOs: 73-91 show the amino acid sequences of dATPases from different phages identified in this application; their database entry information is as follows:
Figure PCTCN2022071726-appb-000005
Figure PCTCN2022071726-appb-000006
Figure PCTCN2022071726-appb-000007
The table below lists the specific sequences of the 2-aminodeoxyadenylosuccinate (ADAS) synthetases identified in this application, distinguishing PurZ from PurZ0.
Figure PCTCN2022071726-appb-000008
Figure PCTCN2022071726-appb-000009
Figure PCTCN2022071726-appb-000010
Detailed Description of the Invention
Unless otherwise indicated, terms used in this application have the meanings commonly understood by those skilled in the art.
Unless otherwise indicated, nucleic acids are written left to right in the 5' to 3' direction, and amino acid sequences are written left to right in the amino to carboxyl direction. Numerical ranges include the numbers defining the range. Amino acids may be referred to herein by their commonly known three-letter symbols or by the one-letter symbols recommended by the IUPAC-IUB Biochemical Nomenclature Commission. Likewise, nucleotides may be referred to by their commonly accepted single-letter codes. The terms defined above are more fully defined by reference to the specification as a whole.
When residue positions in an amino acid sequence alignment are numbered in this application, the numbering excludes the initial methionine (M). For example, for an amino acid sequence that begins with methionine (M), H10 refers to the histidine (H) at the 10th position after M.
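The numbering convention above can be illustrated with a minimal sketch. The function names and the toy sequence are hypothetical and are not taken from the application; the sketch only assumes that the sequence begins with M and that positions are counted from the residue immediately after it.

```python
def residue_label(sequence: str, position: int) -> str:
    """Label the residue at `position`, counted from the residue after the initial Met."""
    if not sequence.startswith("M"):
        raise ValueError("expected a sequence that begins with methionine (M)")
    if not 1 <= position < len(sequence):
        raise ValueError("position lies outside the sequence")
    # M occupies 0-based index 0, so position n in this convention is index n.
    return f"{sequence[position]}{position}"


def conventional_position(position: int) -> int:
    """Convert to ordinary 1-based numbering in which the initial Met is residue 1."""
    return position + 1


toy = "M" + "A" * 9 + "HK"  # made-up sequence, not from the application
print(residue_label(toy, 10))     # H10
print(conventional_position(10))  # 11
```

Under this convention a position quoted in the application is one lower than the position reported by tools that count the initial methionine as residue 1.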
"Polypeptide" and "protein" are used interchangeably herein and refer to polymers of amino acid residues and to variants and synthetic and naturally occurring analogs thereof. These terms therefore apply to naturally occurring amino acid polymers and their naturally occurring chemical derivatives, as well as to amino acid polymers in which one or more amino acid residues are synthetic, non-naturally occurring amino acids (such as chemical analogs of the corresponding naturally occurring amino acids). Such derivatives include, for example, post-translational modifications and degradation products, including phosphorylated, glycosylated, oxidized, isomerized, carboxylated and deaminated variants of polypeptide fragments.
The term "enzyme active center" as used herein refers to the site in an enzyme molecule that binds the substrate molecule directly and catalyzes its chemical reaction. The active center is generally considered to consist of two functional sites: the first is the catalytic site, where bonds of the substrate are broken or new bonds are formed so that a chemical change takes place; the second is the binding site, through which the substrate binds to the enzyme molecule. The functional sites are made up of a small number of amino acid residues, or of certain groups on those residues, that lie close together in the three-dimensional structure of the enzyme. These residues may be far apart in the primary structure, or even located on different peptide chains, and are brought close together in space by the coiling and folding of the peptide chains. For enzymes that require a coenzyme, the coenzyme molecule (such as the metal ions Zn²⁺ and/or Mn²⁺), or part of its structure, is also a component of the functional sites.
The term "amino acid" as used herein refers to a compound in which a hydrogen atom on a carbon atom of a carboxylic acid is replaced by an amino group, so that the molecule contains both an amino and a carboxyl functional group. The term includes naturally occurring and non-naturally occurring amino acids as well as amino acid analogs and mimetics. Naturally occurring amino acids include the 20 (L)-amino acids used in protein biosynthesis as well as other amino acids such as 4-hydroxyproline, hydroxylysine, carboxylysine, desmosine, isodesmosine, homocysteine, citrulline and ornithine. Non-naturally occurring amino acids include, for example, (D)-amino acids, norleucine, norvaline, p-fluorophenylalanine, ethionine and the like, which are known to those skilled in the art. Amino acid analogs include modified forms of naturally and non-naturally occurring amino acids. Such modifications may include, for example, substitution of chemical groups and moieties on the amino acid, or derivatization of the amino acid. Amino acid mimetics include, for example, organic structures that exhibit functionally similar properties, such as the charge and charge-spacing characteristics of the amino acid. For example, an organic structure that mimics arginine (Arg or R) has a positively charged moiety located in a similar molecular space and having the same degree of mobility as the ε-amino group of the side chain of the naturally occurring Arg amino acid. Mimetics also include constrained structures that maintain the optimal spacing and charge interactions of the amino acid or of its functional groups. Those skilled in the art can determine which structures constitute functionally equivalent amino acid analogs and amino acid mimetics.
The term "isoenzyme" as used herein refers to enzymes within an organism that catalyze the same reaction but differ in molecular structure.
The term "nucleic acid" as used herein refers to mRNA, RNA, cRNA, cDNA or DNA, including single-stranded and double-stranded forms of DNA. The term generally refers to a polymeric form of nucleotides at least 10 bases in length, the nucleotides being ribonucleotides, deoxynucleotides, or a modified form of either type of nucleotide.
The term "encoding", as used herein in the context of a particular nucleic acid, means that the nucleic acid contains the information required to direct translation of the nucleotide sequence into a particular protein. The information encoding the protein is specified by codons. A nucleic acid encoding a protein may contain untranslated sequences (for example, introns) within its translated region, or may lack such intervening untranslated sequences (for example, as in cDNA).
As used herein, a "full-length sequence" in reference to a particular polynucleotide or the protein it encodes refers to the entire nucleic acid sequence or the entire amino acid sequence of a native (non-synthetic) endogenous sequence. A full-length polynucleotide encodes the full-length, catalytically active form of the particular protein.
The term "isolated" as used herein refers to a polypeptide or nucleic acid, or a biologically active portion thereof, that is substantially or essentially free of the components that normally accompany or interact with it in its naturally occurring environment. Thus, an isolated polypeptide or nucleic acid produced by recombinant techniques is substantially free of other cellular material or culture medium, and an isolated polypeptide or nucleic acid produced by chemical synthesis is substantially free of chemical precursors or other chemicals. DNA is to be understood as "isolated" unless otherwise indicated or unless the context makes clear that it refers to a region present in a genome.
The term "expression vector" as used herein refers to a recombinantly or synthetically produced nucleic acid construct having a series of specific nucleic acid elements that permit transcription of a particular nucleic acid in a host cell.
The term "host cell" as used herein refers to a cell that receives a foreign gene by transformation or transduction (infection). The host cell may be a eukaryotic cell such as a yeast cell or a prokaryotic cell such as Escherichia coli. In the context of host cells of phages, the host cell is mainly a bacterium.
Specific Embodiments
DNA modifications vary in form and function but generally do not alter Watson-Crick base pairing. Diaminopurine (Z) is a unique exception: in cyanophage it completely replaces adenine and forms three hydrogen bonds with thymine. However, the biosynthesis, species distribution and significance of Z-containing genomes have not yet been fully explored. This application reports a multi-enzyme system that supports the synthesis of Z-containing genomes. The inventors identified dozens of phages worldwide that carry these enzymes, and used LC-UV and mass spectrometry to verify the Z-containing genome of one representative example, Acinetobacter phage SH-Ab 15497. One role of a Z-containing genome is to give the phage an evolutionary advantage by allowing it to evade attack by host restriction endonucleases. The discovery of the biosynthetic pathway for Z-containing genomes makes it possible to prepare Z-DNA (that is, conventional DNA in which A is replaced by Z) on a large scale for a variety of applications.
In a first aspect, the present application provides a polypeptide having 2-aminodeoxyadenylosuccinate (ADAS) synthetase activity, which is capable of catalyzing the formation of 2-aminodeoxyadenylosuccinate using one or more of ATP, dATP, GTP and dGTP, together with dGMP and Asp, as substrates, and in which the catalytic motif is changed to GSxxKG relative to the GDxxKG catalytic motif of adenylosuccinate synthetase (PurA), where x represents any amino acid residue.
2-Aminodeoxyadenylosuccinate (ADAS) synthetase, referred to in this application as PurZ or PurZ0, is an enzyme first identified and characterized by the inventors of this application and is the key enzyme in the synthesis of the Z base. PurZ or PurZ0 is a homolog, in the broad sense, of PurA, which is known in the art to participate in adenine (A) synthesis. The reaction mechanisms are similar (see panel A of Figure 1), but the difference in substrate specificity accounts for the difference between the two reactions. The catalytic motif of PurA has been characterized as GDxxKG (see, for example, positions 12-17 of EcPurA in Figure 6). In all of the PurZ or PurZ0 enzymes identified in this application, the catalytic motif is changed to GSxxKG. According to the inventors' analysis, replacing the bulkier D with the smaller S accommodates the additional 2-amino group of the substrate, giving rise to a different substrate selectivity and specificity. GSxxKG is therefore established as the catalytic motif of PurZ or PurZ0. In some embodiments, the catalytic motif of PurZ or PurZ0 is GSTGKG.
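As a rough illustration of the motif distinction described above, the following sketch screens a protein sequence for GSxxKG versus GDxxKG. It is a simplification and not the inventors' method, which relied on sequence similarity and phylogenetic analysis. The function name, the `window` parameter (restricting the search to the N-terminal region, on the assumption that the motif lies there as it does at positions 12-17 of EcPurA) and the toy sequences are all hypothetical.

```python
import re

# "x" in the motif means any residue, written as "." in the regular expression.
PURA_MOTIF = re.compile(r"GD..KG")
PURZ_MOTIF = re.compile(r"GS..KG")


def classify_catalytic_motif(protein: str, window: int = 40) -> str:
    """Report which catalytic motif, if any, occurs in the first `window` residues."""
    n_terminal = protein[:window].upper()
    if PURZ_MOTIF.search(n_terminal):
        return "PurZ-like (GSxxKG)"
    if PURA_MOTIF.search(n_terminal):
        return "PurA-like (GDxxKG)"
    return "no motif found"


# Made-up sequences for illustration only; they are not SEQ ID NOs from the application.
toy_purz = "MAAAAAAAAAAGSTGKGAAAA"
toy_pura = "MAAAAAAAAAAGDAAKGAAAA"

print(classify_catalytic_motif(toy_purz))  # PurZ-like (GSxxKG)
print(classify_catalytic_motif(toy_pura))  # PurA-like (GDxxKG)
```

A motif hit alone would not establish function; in the application the motif is considered together with the other conserved residues identified by alignment.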
In some embodiments, when the polypeptide is aligned with the amino acid sequence of the adenylosuccinate synthetase from Escherichia coli (SEQ ID NO: 72, EcPurA), the residue corresponding to position 303 of SEQ ID NO: 72 is changed from R to L.
Sequence alignment is a technique routinely used by those skilled in the art, and conventional alignment tools are available (for example BLASTn, BLASTp, BLASTx and the like). In addition, because PurZ or PurZ0 and PurA belong to the same family of homologs, they are readily aligned with each other. Relative to PurA, PurZ or PurZ0 has, for example, the aliphatic amino acid L in place of R at the position corresponding to 303 of EcPurA, which is consistent with the substrate of PurZ or PurZ0 being a deoxyribonucleotide rather than a ribonucleotide.
In some embodiments, when the polypeptide is aligned with the amino acid sequence shown in SEQ ID NO: 2 (SbPurZ), it has T at the position corresponding to 274, N at the position corresponding to 306, F at the position corresponding to 307 and N at the position corresponding to 309 of SEQ ID NO: 2. These positions are relatively important residues that the inventors identified as having some influence on reaction activity.
In some embodiments, the polypeptide comprises the sequence shown in any one of SEQ ID NOs: 1-4, 9-69, 71 and 92-146; or a variant of the sequence shown in any one of SEQ ID NOs: 1-4, 9-69, 71 and 92-146 that has an insertion, deletion and/or substitution of one or more amino acids and retains 2-aminodeoxyadenylosuccinate (ADAS) synthetase activity; or a fragment of the sequence shown in any one of SEQ ID NOs: 1-4, 9-69, 71 and 92-146 that retains the catalytic motif.
SEQ ID NOs: 1-4 are examples of PurZ identified and tested in this application, and SEQ ID NO: 71 is the PurZ amino acid sequence of the cyanophage. SEQ ID NOs: 9-69 and 92-146 are PurZ or PurZ0 sequences of other phages that the inventors obtained from known phage gene databases by sequence similarity and phylogenetic tree analysis. Alignment analysis shows that their catalytic motifs and important residues agree with those of the PurZ or PurZ0 examples identified and tested in this application, so they are expected to have PurZ or PurZ0 function.
The inventors have thoroughly characterized the structure and function of PurZ or PurZ0 and identified the catalytic motif and important residues. On that basis, an insertion, deletion and/or substitution of one or more amino acids outside the catalytic and binding domains is not expected to affect PurZ or PurZ0 function. In some embodiments, the number of amino acid insertions, substitutions and/or deletions is 1-30, preferably 1-20, more preferably 1-10, for example 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10.
Similarly, given knowledge of the catalytic motif and important residues, fragments obtained by appropriate truncation are also expected to be functional.
In a second aspect, the present application provides a polypeptide having 2'-deoxyadenosine 5'-triphosphate triphosphohydrolase (dATPase) activity, which is capable of catalyzing the hydrolysis of dATP to 2'-deoxyadenosine (dA) and which comprises a metal- and ligand-binding pocket. In some embodiments, the polypeptide also catalyzes the hydrolysis of dADP and dAMP to 2'-deoxyadenosine (dA).
dATPase is another enzyme involved in Z-genome biosynthesis that the inventors discovered and identified. Because the host parasitized by the phage may produce dATP, the precursor for incorporating A bases, one role of dATPase is to promote the synthesis of Z-containing genomes by specifically removing dATP and its precursor dADP, thereby preventing A from being incorporated into the phage genome.
In some embodiments, the polypeptide comprises Co²⁺ as a divalent metal cofactor.
In some embodiments, the polypeptide comprises the sequence shown in any one of SEQ ID NOs: 5-7 and 73-91; or a variant of the sequence shown in any one of SEQ ID NOs: 5-7 and 73-91 that has an insertion, deletion and/or substitution of one or more amino acids and retains dATPase activity; or a fragment of the sequence shown in any one of SEQ ID NOs: 5-7 and 73-91 that retains the catalytic motif.
SEQ ID NOs: 5-7 are examples of dATPase identified and tested in this application. The inventors have likewise characterized the catalytic mechanism of dATPase, on the basis of which variants and fragments having the same or similar dATPase function can be expected. SEQ ID NOs: 73-91 are dATPase sequences of other phages that the inventors obtained from known phage gene databases by sequence similarity and phylogenetic tree analysis. Alignment analysis shows that their catalytic motifs and important residues agree with those of the dATPase examples identified and tested in this application, so they are expected to have dATPase function.
In a third aspect, the present application provides a polypeptide having dATP and dGTP pyrophosphohydrolase activity, which is capable of catalyzing the hydrolysis of dATP to dAMP and of dGTP to dGMP and which comprises a metal- and ligand-binding pocket.
DUF550 is another enzyme involved in Z-genome biosynthesis that the inventors discovered and identified. DUF550 catalyzes the hydrolysis of dGTP to dGMP, and dGMP is one of the substrates of PurZ or PurZ0. The presence of DUF550 can raise the level of dZTP while also consuming dATP, further promoting the incorporation of Z.
In some embodiments, the polypeptide comprises Co²⁺ as a divalent metal cofactor.
In some embodiments, the polypeptide comprises the sequence shown in SEQ ID NO: 8; or a variant of the sequence shown in SEQ ID NO: 8 that has an insertion, deletion and/or substitution of one or more amino acids and retains dATP and dGTP pyrophosphohydrolase activity; or a fragment of the sequence shown in SEQ ID NO: 8 that retains the catalytic motif.
SEQ ID NO: 8 is an example of DUF550 identified and tested in this application. The inventors have likewise characterized the catalytic mechanism of DUF550, on the basis of which variants and fragments having the same or similar DUF550 function can be expected.
The polypeptides of the first to third aspects can all participate in the Z-base synthesis pathway of phages, the polypeptide of the first aspect being the most critical. dATPase and DUF550 do not take part in the synthetic reaction itself, but contribute to the synthesis of Z-containing genomes.
In a fourth aspect, the present application provides nucleic acid molecules encoding the polypeptides of the first to third aspects.
A nucleic acid molecule serving as a coding sequence may be combined with other DNA sequences, such as promoters, polyadenylation signals, additional restriction enzyme sites, multiple cloning sites, other coding segments and the like.
Nucleic acids and fusions thereof can be prepared, manipulated and/or expressed using any of a variety of well-established techniques known and available in the art. For example, nucleic acid molecules encoding the polypeptides of the present application, or variants and fragments thereof, can be used in recombinant DNA molecules to direct expression of the polypeptides in appropriate host cells. Owing to the inherent degeneracy of the genetic code, other DNA sequences that encode substantially the same or a functionally equivalent amino acid sequence can also be used in the present application, and these sequences can be used to clone and express a given polypeptide.
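The degeneracy mentioned above can be shown with a small sketch in which two different DNA sequences translate to the same peptide. The codon assignments are those of the standard genetic code, restricted here to the few amino acids needed; the two DNA sequences are made up for illustration and are not taken from the application. GSTGKG is used only because it is the motif named in the text.

```python
# Partial standard codon table, limited to the residues used in this example.
CODONS = {
    "ATG": "M",
    "GGT": "G", "GGC": "G", "GGA": "G", "GGG": "G",
    "TCT": "S", "TCC": "S", "TCA": "S", "TCG": "S", "AGT": "S", "AGC": "S",
    "ACT": "T", "ACC": "T", "ACA": "T", "ACG": "T",
    "AAA": "K", "AAG": "K",
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence, reading codons from the first base."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODONS[dna[i:i + 3]] for i in range(0, len(dna), 3))


variant_a = "GGTTCTACTGGTAAAGGT"  # hypothetical coding sequence
variant_b = "GGCAGCACCGGCAAGGGC"  # different codons, same peptide

print(translate(variant_a))  # GSTGKG
print(translate(variant_b))  # GSTGKG
```

In practice the choice among synonymous codons is usually made to suit the expression host, which is one reason a given polypeptide may be encoded by many different nucleic acid molecules.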
In addition, the nucleic acid molecules of the present application can be engineered using methods well known in the art, including but not limited to altering the cloning, processing, expression and/or activity of the gene product.
In some embodiments, the nucleic acid molecule is produced by artificial synthesis, for example direct chemical synthesis or enzymatic synthesis.
In some embodiments, the nucleic acid molecule is produced by recombinant techniques.
In some embodiments, the nucleic acid molecule is an isolated nucleic acid molecule.
In a fifth aspect, the present application provides a vector comprising the nucleic acid molecule of the fourth aspect.
The nucleic acid molecule of the fourth aspect can be provided in the vector as an expression cassette. The expression cassette may additionally contain a 5' leader sequence, which can act to enhance translation. In preparing the expression cassette, the various DNA fragments can be manipulated so as to provide DNA sequences in the proper orientation and, where appropriate, in the proper reading frame. To this end, adapters or linkers may be used to join the DNA fragments, or other manipulations may be involved to provide convenient restriction sites, remove superfluous DNA, remove restriction sites and so on. For this purpose, in vitro mutagenesis, primer repair, restriction digestion, annealing and resubstitution (for example transitions and transversions) may be involved.
In a sixth aspect, the present application provides a method of engineering a phage, comprising introducing a nucleic acid molecule encoding the polypeptide of the first aspect into the genome of the phage so that the phage expresses the polypeptide of the first aspect.
Phage engineering is within the ability of those skilled in the art. For an engineered phage whose native genome contains normal A bases, introducing PurZ or PurZ0 may enable synthesis of the Z base.
In some embodiments, the method further comprises introducing a nucleic acid molecule encoding the polypeptide of the second aspect into the genome of the phage so that the phage expresses the polypeptide of the second aspect; and/or introducing a nucleic acid molecule encoding the polypeptide of the third aspect into the genome of the phage so that the phage expresses the polypeptide of the third aspect.
As described above, further introducing dATPase and DUF550 by engineering can promote the synthesis of Z-containing genomes.
In some embodiments, the method further comprises introducing a nucleic acid molecule encoding adenylosuccinate lyase (PurB) into the genome of the phage so that the phage expresses adenylosuccinate lyase (PurB); and/or introducing a nucleic acid molecule encoding GMP kinase (GK) into the genome of the phage so that the phage expresses GMP kinase (GK).
PurB and GK are required for genome synthesis, but may also be supplied by the host bacterium.
In a seventh aspect, the present application provides a phage obtained by the method of the sixth aspect.
In an eighth aspect, the present application provides a host cell comprising the phage of the seventh aspect.
In some embodiments, the host cell is a bacterial cell.
In a ninth aspect, the present application provides use of the phage of the seventh aspect or the host cell of the eighth aspect in diaminopurine deoxyribonucleotide (dZTP) synthesis, DNA synthesis, DNA origami, DNA-based data storage, or the preparation of antibacterial drugs, bactericides or preservatives. In this application, the inventors report a biosynthetic system for Z-containing genomes that can facilitate the production of Z-substituted DNA, which may subsequently be used in a variety of emerging applications such as DNA origami (23) and DNA-based data archiving, which has great potential because of its high storage capacity (24). Introducing the enzymes for Z-containing genome biosynthesis into engineered phages may broaden their host range and increase their potency, for example in phage therapy, which has already been used successfully in the clinic to save lives from infection with multidrug-resistant Acinetobacter baumannii (25) or Mycobacterium abscessus (26), in food preservation (27), and for environmental protection purposes. Increasing base diversity has long been sought in the art (28-30), and the inventors' work shows that nature has already provided such an approach. Given that the Z base has been found in meteorites, the inventors' work may also be of help to interdisciplinary research on the origin of life and astrobiology (31).
Examples
The following examples are illustrative only and are not intended to limit the scope of the embodiments of the present application or the scope of the appended claims.
Example 1. Identification, Characterization and Testing of PurZ
Materials and Methods
Annotation of the cyanophage S-2L genome and identification of CpPurZ
The complete genome of cyanophage S-2L was sequenced and included in a patent application in 2003 (NCBI accession: AX955019; length: 45570 bp; SEQ ID No. 70) (8), and was resequenced in 2018 using next-generation sequencing (NGS) (NCBI accession: SRR8295598; SEQ ID No. 71). The NGS sequencing data were not assembled. Using SPAdes 3.14.0 (http://cab.spbu.ru/software/spades/), the inventors assembled the NGS reads into a 44485 bp scaffold. The NGS genome sequence differs markedly from the previously patented genome sequence at many positions. To identify candidate genes involved in Z-genome biosynthesis, the inventors annotated both versions of the cyanophage S-2L genome using the FgenesV/FgenesV0 software from the Softberry platform (http://www.softberry.com/). In the patent-application genome, the PurZ gene was identified at positions 15407-14232, with a gene length of 1176 bp, encoding a 391-aa protein. In the NGS genome, the PurZ gene was identified as starting at position 15122 and ending at position 16202, with a gene length of 1080 bp, encoding a 359-aa protein. Alignment with other PurZ CDSs showed that the difference between the two versions arises from an insertion/deletion (indel) at the C-terminus, and that the NGS version should be the correct one.
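The gene and protein lengths quoted for the two PurZ annotations can be cross-checked with a few lines of arithmetic. The sketch below is only illustrative: the helper names are hypothetical, and it assumes that each quoted gene length includes one stop codon. Under an inclusive coordinate convention the NGS span 15122-16202 counts 1081 positions rather than the stated 1080 bp, so one endpoint may be exclusive there; that reading is an assumption, not something the source states.

```python
def span_length_bp(start: int, end: int) -> int:
    """Inclusive length of a feature, regardless of strand orientation."""
    return abs(end - start) + 1


def protein_length_aa(gene_length_bp: int) -> int:
    """Residues encoded by a CDS whose length includes one stop codon (assumption)."""
    if gene_length_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_length_bp // 3 - 1


# Patent-application genome: positions 15407-14232 (reverse orientation).
patent_len = span_length_bp(15407, 14232)
print(patent_len, protein_length_aa(patent_len))

# NGS genome: stated gene length of 1080 bp.
print(protein_length_aa(1080))
```

Both stated protein lengths (391 aa and 359 aa) are consistent with the stated gene lengths under the stop-codon assumption.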
PurZ-containing contigs in metagenomes misannotated as bacterial may be of phage origin
Three PurZ-containing contigs (GenBank: SSEA01000061.1, QQVW01000181.1 and NHHH01000025.1; the corresponding PurZ sequences are the three orange nodes in panel A of Figure 2) are annotated in NCBI as being of bacterial origin. BLASTp searches were performed on the proteins they encode. Almost all of the encoded proteins were highly similar to phage sequences, such as phage terminase, capsid, tail fiber, tail loop and baseplate proteins, suggesting that the contigs are of phage rather than bacterial origin (Table 2).
Homology modeling of CpPurZ, SbPurZ and CpdATPase, and molecular docking
HHpred (32) was used to find templates for CpPurZ, SbPurZ and CpdATPase. Among the top-ranked structures, PDB 1CIB (EcPurA) (10) and 1MEZ (MmPurA) (33) were used as templates for modeling CpPurZ and SbPurZ, respectively, because these structures are in complex with all substrates/substrate mimics or products. Both PurZ models bind three substrates: dGMP, ATP and Asp.
For CpdATPase, PDB 5TK7 (34) was chosen as the template for the homology model. 5TK7 is a Co2+-dependent oxetanocin-A triphosphate/monophosphate phosphatase or dATP/dAMP phosphatase that shares only 18% overall sequence identity with CpdATPase but has the same metal-binding residues.
All structural models were built using Prime v.5.3 in [Figure PCTCN2022071726-appb-000011] Suite 2019-1. The models were then used for molecular docking and structural analysis. Using Glide SP in [Figure PCTCN2022071726-appb-000012] Suite 2019-1, the inventors docked the KEGG compounds (35) (a metabolite library of about 18000 metabolites) into the IMP- and GTP-binding pockets of PurZ. The top-ranked ligands from the PurZ docking results indicate that the IMP site prefers nucleotides/deoxynucleotides with a guanine base, whereas the GTP site has no strong base preference. For CpdATPase, the inventors docked all nucleoside derivatives from the KEGG compounds, together with dZTP, into the substrate-binding pocket. The ligand-binding pocket of CpdATPase is fairly open and can accommodate all common dNTPs.
During the course of this application, crystal structures of VpPurZ in complex with one or two of the three substrates/substrate analogs were released in the Protein Data Bank (PDB: 6FM1 series). The overall fold and active-site structure are similar to those of the inventors' homology models.
Sequence similarity networks (SSNs) of the PurA superfamily and the PurZ family
The SSN of the PurA superfamily (Pfam accession: PF00709) was generated using the web-based tool EFI-EST (https://efi.igb.illinois.edu/efi-est/) (36). The SSN was displayed in Cytoscape 3.7.0 (37) at an e-value of 10^-160, with each node representing a group of sequences sharing ≥85% sequence identity. Seventy-six proteins (62 from phages, 12 from metagenomes and 2 from archaea) were assigned as putative PurZ. Their sequences were used to build a separate SSN, displayed at an e-value of 10^-70. During the course of this application, 56 additional PurZ sequences were found; they are shown in Table 1.
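The effect of the e-value threshold on an SSN can be illustrated with a toy example. The sketch below is not EFI-EST or Cytoscape code; the node names and pairwise e-values are invented for illustration. It keeps an edge only when the pairwise e-value is at or below the cutoff and reports the resulting connected clusters, showing how a stringent cutoff such as 10^-160 can separate groups that merge at a looser cutoff such as 10^-70.

```python
def ssn_clusters(nodes, pairwise_evalues, cutoff):
    """Group nodes into connected clusters, keeping edges with e-value <= cutoff."""
    parent = {n: n for n in nodes}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), evalue in pairwise_evalues.items():
        if evalue <= cutoff:
            parent[find(a)] = find(b)

    clusters = {}
    for n in nodes:
        clusters.setdefault(find(n), set()).add(n)
    return sorted(sorted(c) for c in clusters.values())


# Hypothetical nodes and e-values, for illustration only.
nodes = ["PurA_1", "PurA_2", "PurZ_1", "PurZ_2"]
evalues = {
    ("PurA_1", "PurA_2"): 1e-200,
    ("PurZ_1", "PurZ_2"): 1e-180,
    ("PurA_1", "PurZ_1"): 1e-90,
}
print(ssn_clusters(nodes, evalues, 1e-160))  # stringent: two clusters
print(ssn_clusters(nodes, evalues, 1e-70))   # looser: one merged cluster
```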
Multiple sequence alignment of the PurA and PurZ families
MAFFT version 7.450 (38) was used to generate multiple sequence alignments of the PurA superfamily (Pfam: PF00709), the PurZ family, selected representative PurA/PurZ sequences, and CpdATPase homologs. Based on the multiple sequence alignments of the PurA/PurZ sequences and of the dATPase sequences, the key ligand-binding residues were used to generate sequence logos with WebLogo 3 (39).
Genomic context of PurZ
Using EFI-GNT (https://efi.igb.illinois.edu/efi-gnt/), a genome neighborhood network (GNN) was generated from the SSN of the PurZ family (40). The 10 upstream and 10 downstream neighboring genes of PurZ were collected and analyzed. For noise filtering, the co-occurrence percentage was set to 5%.
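The neighborhood window and co-occurrence filter can be sketched as follows. This is a simplified illustration and not the EFI-GNT implementation: the gene orders are invented, and co-occurrence is assumed here to mean the fraction of PurZ neighborhoods in which a gene family appears at least once.

```python
from collections import Counter


def neighborhood(genes, anchor="PurZ", window=10):
    """Return up to `window` genes on each side of the anchor gene."""
    i = genes.index(anchor)
    return genes[max(0, i - window):i] + genes[i + 1:i + 1 + window]


def cooccurring_families(genomes, anchor="PurZ", window=10, min_fraction=0.05):
    """Families found near the anchor in at least `min_fraction` of genomes."""
    counts = Counter()
    for genes in genomes:
        counts.update(set(neighborhood(genes, anchor, window)))
    total = len(genomes)
    return {fam: n / total for fam, n in counts.items() if n / total >= min_fraction}


# Hypothetical gene orders, for illustration only.
genomes = [
    ["terminase", "PurB", "PurZ", "dATPase"],
    ["capsid", "PurZ", "dATPase", "DUF550"],
]
print(cooccurring_families(genomes))                     # 5% cutoff keeps everything here
print(cooccurring_families(genomes, min_fraction=0.75))  # stricter cutoff
```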
Identification of restriction endonucleases in Acinetobacter baumannii
Twenty-six potential restriction endonucleases were collected from UniProt by searching for "restriction endonuclease" in Acinetobacter baumannii. To identify bona fide restriction endonucleases with high similarity to characterized restriction endonucleases, BLASTp was run against the Swiss-Prot database. Three of the 26 candidates were identified: A0A0B2XQW9, A0A3R9RVJ0 and A0A241ZGY5. A0A0B2XQW9 shares 87.7% sequence identity with P00642 (EcoRI), cleavage site G/AATTC; A0A3R9RVJ0 shares 75.9% sequence identity with P00640 (PstI), cleavage site CTGCA/G; and A0A241ZGY5 shares 59.9% sequence identity with P33562 (BsuBI), cleavage site CTGCA/G. EcoRI and PstI were selected for further study.
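As an illustration of the cleavage-site notation used above (G/AATTC for EcoRI, CTGCA/G for PstI), the following sketch locates the sites in a sequence and reports the 0-based index of the first base after each cut on the given strand. The test sequence is invented, and the function name is hypothetical.

```python
# Recognition sequence and offset of the cut within it ("/" position).
SITES = {
    "EcoRI": ("GAATTC", 1),  # G/AATTC
    "PstI": ("CTGCAG", 5),   # CTGCA/G
}


def find_cut_positions(seq: str, enzyme: str) -> list:
    """0-based indices of the first base downstream of each cut."""
    site, offset = SITES[enzyme]
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + offset)
        start = seq.find(site, start + 1)
    return cuts


demo = "AAGAATTCTTCTGCAGAA"  # hypothetical sequence with one site of each kind
print(find_cut_positions(demo, "EcoRI"))
print(find_cut_positions(demo, "PstI"))
```

Both recognition sequences contain adenine, so in a genome where A is replaced by Z the literal sites above would not be present in the same chemical form.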
Experimental materials
LB medium was purchased from Oxoid Limited (Hampshire, UK). Ultrapure deionized water from a Millipore Direct-Q system was used. TALON resin was purchased from Clontech Laboratories Inc. (California, USA). All protein purification chromatography was performed on a [Figure PCTCN2022071726-appb-000013] pure FPLC system (GE Healthcare, USA). dZTP was purchased from TriLink (California, USA). Other nucleotides were purchased from Sigma-Aldrich.
Gene synthesis and cloning
Codon-optimized gene fragments for PurZ, dATPase, EcPurB, SeGK and ApDUF550 were synthesized by Genewiz Inc. (Suzhou, China) and inserted between the NdeI and BamHI restriction sites of pET-28a(+) to express proteins with an N-terminal His6 tag. Using the Gibson assembly cloning protocol, CpPurZ, ApdATPase and ApDUF550 were also inserted into the SspI restriction site of the pET-28a(+)-HMT vector. The resulting plasmids contain, in tandem, a His6 tag, maltose-binding protein (MBP) and a TEV protease cleavage site, followed by the construct of interest, as confirmed by sequencing. The EcPurB (UniProt: P0AB89) gene was amplified by PCR (forward primer CCAGAGCGGATCAGGAATGGAATTATCCTCACTGACCG; reverse primer CCAATTGAGATCTGCCATATGTTATTTCAGCTCATCAACCATCG) and inserted into the pET-28a(+) vector by Gibson assembly to express the protein with an N-terminal His6 tag.
Expression and purification of PurZ, PurB, dATPase, SeGK and ApDUF550
E. coli BL21(DE3) cells were transformed with plasmids encoding the PurZ, PurB, dATPase, SeGK and ApDUF550 genes and plated on LB agar supplemented with 50 μg/mL kanamycin. Transformants were grown in LB medium (300 mL) at 37°C in a shaking incubator at 220 rpm. When the OD600 reached about 0.8, the temperature was lowered to 18°C and isopropyl β-D-1-thiogalactopyranoside (IPTG) was added to a final concentration of 0.5 mM to induce production of the target protein. After 16-20 hours, cells were harvested by centrifugation (6000×g, 10 minutes at 4°C).
Cells were resuspended in 15 mL of lysis buffer (50 mM Tris-HCl, pH 8.0, 1 mM phenylmethanesulfonyl fluoride, 0.2 mg/mL lysozyme, 0.03% Triton X-100 and 0.02 mg/mL DNase I). The cell suspension was frozen in a -80°C freezer, then thawed and incubated in a 25°C water bath for 30 minutes to allow cell lysis. Then 1.5 mL of 11% streptomycin sulfate (in water) was added to the cell lysate, followed by gentle mixing and centrifugation (20000×g, 10 minutes at 4°C). The supernatant was passed through a 0.22 μm filter and loaded onto a 5 mL TALON cobalt column pre-equilibrated with buffer A (20 mM HEPES (pH 7.5), 5 mM β-mercaptoethanol (BME) and 0.2 M KCl). The column was washed with 10 column volumes (CV) of buffer A, and the protein was then eluted with 5 CV of buffer A containing 150 mM imidazole. The eluate was dialyzed overnight against 2 L of buffer B (20 mM HEPES (pH 7.5), 0.1 M KCl and 1 mM dithiothreitol) to remove imidazole and then concentrated using a centrifugal concentrator (30K MWCO; Millipore). The concentration of purified protein was calculated from the absorbance at 280 nm measured on a NANODROP ONE (Thermo SCIENTIFIC). Purified proteins were run alongside a commercial protein marker (Genstar, Shenzhen) on 4-20% SDS-polyacrylamide gradient gels and visualized by Coomassie staining.
UV-Vis spectrophotometric and colorimetric analysis of PurZ
A 50 μL reaction mixture containing 20 mM HEPES (pH 7.5), 2 mM dGMP, 1 mM ATP, 2 mM Mg2+, 5 mM Asp-Na+ and 5 μM Sb/Sp/Vp/ApPurZ was incubated at room temperature (RT) for 0-6 h. Absorbance from 220 to 340 nm was monitored using a NANODROP ONE (Thermo SCIENTIFIC). Phosphate was quantified using a colorimetric phosphomolybdate assay (13).
Colorimetric phosphate assay for the substrate specificity of SbPurZ
A 400 μL reaction mixture containing 20 mM HEPES (pH 7.5), 2 mM dGMP (or dIMP, GMP or IMP), 1 mM ATP (or GTP), 2 mM Mg2+, 5 mM Asp-Na+ (omitted in the blank) and 5 μM SbPurZ was incubated at room temperature for 1 hour. Phosphate was quantified using a colorimetric phosphomolybdate assay (13).
ESI-MS/MS analysis of the SbPurZ-EcPurB reaction
A 300 μL reaction mixture containing 20 mM Tris-HCl (pH 8.0), 1 mM dGMP, 0.5 mM ATP, 2 mM Mg2+, 5 mM Asp-Na+ and 5 μM SbPurZ was incubated at room temperature for 1 hour. Two negative controls, omitting either SbPurZ or Asp-Na+, were also prepared. The reaction was then applied to a centrifugal concentrator (3K MWCO; Millipore). The flow-through was analyzed by ESI-MS/MS on a Q Exactive HF/UltiMate 3000 RSLCnano (Thermo Fisher) instrument. The sample loading volume was 5 μL and the loading rate was 0.2 mL/min. The SbPurZ assay was also repeated in the presence of 5 μM EcPurB.
Dependence of PurZ activity on substrate concentration
The absorbance at 287 nm of 200 μL reaction mixtures in a 96-well plate was monitored with a Tecan M200 plate reader at 15 s intervals for 1-5 minutes at room temperature. Each reaction mixture contained 50 mM HEPES (pH 8.5), 2 mM Mg2+, 0.2-1 μM PurZ enzyme (Sb/Sp/Vp/ApPurZ), two of the three substrates at fixed concentrations (5 mM Asp-Na+, 0.1 mM ATP, dGMP), and the third substrate at varying concentration: 500-0 μM Asp-Na+, 125-0 μM ATP or 60-0 μM dGMP.
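One common way to analyze such a substrate-concentration series is to fit initial rates to the Michaelis-Menten equation. The source does not state which kinetic model or fitting procedure was used, so the sketch below is an assumption throughout: it uses a Hanes-Woolf linearization ([S]/v = [S]/Vmax + Km/Vmax) with ordinary least squares, and the rates are synthetic, generated from made-up parameters rather than measured data.

```python
def michaelis_menten(s, vmax, km):
    """Initial rate at substrate concentration s."""
    return vmax * s / (km + s)


def fit_hanes_woolf(conc, rates):
    """Estimate (Vmax, Km) from a least-squares line through ([S], [S]/v)."""
    pts = [(s, s / v) for s, v in zip(conc, rates) if s > 0 and v > 0]
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    slope = sxy / sxx                    # equals 1 / Vmax
    intercept = mean_y - slope * mean_x  # equals Km / Vmax
    vmax = 1.0 / slope
    return vmax, intercept * vmax


# Synthetic dGMP-like dilution series (uM); parameters are hypothetical.
conc = [60, 30, 15, 7.5, 3.75]
rates = [michaelis_menten(s, vmax=1.0, km=20.0) for s in conc]
print(fit_hanes_woolf(conc, rates))
```

With real, noisy data a direct nonlinear fit is generally preferred over linearized forms; the linearization is used here only to keep the example dependency-free.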
Dependence of SbPurZ activity on dIMP, IMP and GMP concentration
The absorbance at 290 nm (41) or 287 nm of 200 μL reaction mixtures in a 96-well plate was monitored with a Tecan M200 plate reader at 30 s intervals for 3-4 minutes at room temperature. Each reaction mixture contained 50 mM HEPES (pH 8.5), 2 mM Mg2+, 5 mM Asp-Na+, 0.1 mM ATP, 100-0 μM dIMP, IMP or GMP, and 5 μM SbPurZ wild-type or S15D mutant enzyme.
ESI-MS analysis of the dATPase reaction
A 300 μL reaction mixture containing 10 mM Tris-HCl (pH 7.5), 1 mM dATP or 1 mM dADP, 0.5 mM Co2+ and 0.1-2 μM Cp/ApdATPase was incubated at room temperature for 1 hour. The negative control omitted Cp/ApdATPase. The reaction mixture was then incubated in a boiling water bath for 3 minutes, and precipitated protein was removed by centrifugation (18000×g, 5 minutes). The supernatant was filtered and loaded onto an Agilent 6230 TOF LC/MS instrument (Agilent Technologies, CA, USA) for ESI-MS analysis without liquid chromatography. The sample loading volume was 5 μL and the loading rate was 0.2 mL/min.
ESI-MS analysis of the ApDUF550 reaction
A 300 μL reaction mixture containing 10 mM Tris-HCl (pH 7.5), 0.5 mM dATP or dGTP, 2 mM Co2+ and 0.5 μM ApDUF550 was incubated at room temperature for 0.5 hours. The negative control omitted ApDUF550. The enzyme was then removed from the reaction mixture with a centrifugal concentrator (3K MWCO; Millipore), and the sample was loaded onto an Agilent 6230 TOF LC/MS instrument (Agilent Technologies, CA, USA) for ESI-MS analysis without an LC column. The sample loading volume was 5 μL and the loading rate was 0.2 mL/min.
Substrate specificity of dATPase
A 100 μL reaction mixture containing 10 mM HEPES (pH 7.5), 1 mM (d)NTP, 2 mM Mg2+, 2 mM Co2+, 0.2 μM Cp/Sp/ApdATPase and 0.3 μM yeast inorganic pyrophosphatase 1 (IPP1) was incubated at room temperature for 1 h. Triphosphate was quantified using a colorimetric phosphomolybdate assay (13, 42).
Colorimetric pyrophosphate and phosphate assays for the Cp/Sp/ApdATPase reactions
A 100 μL reaction mixture containing 10 mM HEPES (pH 7.5), 0.5 mM dATP or dADP, 2 mM Mg2+, 2 mM Co2+, 0.2 μM Cp/Sp/ApdATPase and 0.3 μM inorganic pyrophosphatase 1 (IPP1) was incubated at room temperature for 1 h. Pyrophosphate was quantified using a colorimetric phosphomolybdate assay (13). A 100 μL reaction mixture containing 10 mM HEPES (pH 7.5), 0.5 mM dAMP, 2 mM Co2+ and 1 μM Cp/Sp/ApdATPase was incubated at room temperature for 1 hour. Phosphate was quantified using a colorimetric phosphomolybdate assay (13, 42).
Dependence of dATPase activity on substrate concentration
A 100 μL reaction mixture containing 10 mM HEPES (pH 7.5), 600-0 μM dATP/dADP/dAMP, 2 mM Mg2+, 2 mM Co2+, 0.04-0.8 μM Cp/Sp/ApdATPase and 0.3 μM IPP1 was incubated at room temperature for 0-10 minutes. Triphosphate, pyrophosphate and phosphate were quantified using a colorimetric phosphomolybdate assay (13, 42).
Metal selectivity of dATPase
A 100 μL reaction mixture containing 10 mM HEPES (pH 7.5), 0.5 mM dAMP, 2 mM Co2+ (or another divalent metal ion) and 1 μM Sp/CpdATPase was incubated at room temperature for 1 hour. Phosphate was quantified using a colorimetric phosphomolybdate assay (13, 42).
Site-directed mutagenesis of SbPurZ
Point mutations in the SbPurZ active site were introduced by site-directed mutagenesis and confirmed by sequencing. Each 25 μL PCR reaction contained 50 ng of pET28a(+)-SbPurZ plasmid as template, 0.4 μM forward and reverse primers (Table 6) and Fast Alteration DNA polymerase (KM101, TIANGEN, Beijing, China). After 17 cycles, the PCR reaction mixture was digested with DpnI to remove the template and then transformed into FDM competent cells (TIANGEN). SbPurZ mutant proteins were expressed and purified following the protocol described for wild-type PurZ.
Substrate specificity of ApDUF550
A 100 μL reaction mixture containing 10 mM HEPES (pH 7.5), 0.5 mM (d)NTP, 2 mM Mg2+, 2 mM Co2+, 0.2 μM ApDUF550 (omitted in the blank control) and 0.3 μM IPP1 was incubated at room temperature for 0.5 hours. Pyrophosphate formation was quantified using a colorimetric phosphomolybdate assay (13).
Metal selectivity of ApDUF550
A 100 μL reaction mixture containing 10 mM HEPES (pH 7.5), 0.5 mM dATP, 2 mM Co2+ (or another metal ion), 0.3 μM IPP1 (with 2 mM Mg2+) and 0.2 μM HMT-ApDUF550 was incubated at room temperature for 0.5 hours. Pyrophosphate was quantified using a colorimetric phosphomolybdate assay (13).
PCR amplification of the ApPurZ DNA fragment using dATP or dZTP
The reaction contained 20 ng of pET28a(+)-ApPurZ, 0.2 mM dNTPs (containing dATP) or 0.2 mM dNTPs (containing dZTP; TriLink N-2003-1), 0.5 μM forward primer (ATGAAGAAGGCGACCGTTATTT), 0.5 μM reverse primer (TCAAGCAATGTTTGATGATTTGTTAT), 1× Q5 reaction buffer and 1 μL of Q5 DNA polymerase (NEB M0491L). Thermal cycling was performed on a T100 thermal cycler (BIO-RAD), with an initial denaturation at 98°C for 30 seconds followed by 30 cycles of 98°C for 10 seconds, 60°C for 15 seconds and 72°C for 30 seconds. PCR products were purified with the StarPrep Gel Extraction Kit (GenStar) according to the manufacturer's instructions. Concentrations were then measured on a NANODROP ONE (Thermo SCIENTIFIC), and purity was assessed by agarose gel electrophoresis, loading 400 ng of DNA per lane. Samples digested with the restriction enzyme SspI (NEB R0132S) were also included in the gel analysis.
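The two ApPurZ primers quoted above can be sanity-checked computationally. The sketch below uses hypothetical helper names and checks only basic properties: primer length, GC fraction, that the forward primer begins with a start codon, and that the reverse primer, read on the coding strand, ends with a stop codon.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


forward = "ATGAAGAAGGCGACCGTTATTT"
reverse = "TCAAGCAATGTTTGATGATTTGTTAT"

coding_end = reverse_complement(reverse)  # reverse primer as read on the coding strand
print(len(forward), len(reverse))
print(round(gc_fraction(forward), 2), round(gc_fraction(reverse), 2))
print(forward.startswith("ATG"), coding_end[-3:] in {"TAA", "TAG", "TGA"})
```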
ESI-MS analysis of the Sp/ApPurZ-EcPurB-SeGK reaction
A 300 μL reaction mixture containing 20 mM HEPES (pH 8.5), 1 mM dGMP, 1 mM ATP, 5 mM Mg2+, 5 mM Asp-Na+, 5 μM Sp/ApPurZ and 5 μM EcPurB was incubated at room temperature for 4 hours or 0.5 hours. The enzymes were removed from the reaction mixture with a centrifugal concentrator (3K MWCO; Millipore), and the product contained in the flow-through was used as the substrate for SeGK. SeGK (5 μM), 1 mM ATP and 5 mM Mg2+ were added, and the mixture was incubated at room temperature for a further 45 minutes. The reaction mixture was again applied to a centrifugal concentrator (3K MWCO; Millipore) to retain the protein. The flow-through was then loaded onto an Agilent 6230 TOF LC/MS instrument (Agilent Technologies, CA, USA) for ESI-MS analysis without an LC column. The sample loading volume was 5 μL, loaded with water at a rate of 0.2 mL/min.
Extraction of genomic DNA from Acinetobacter phage SH-Ab15497
Acinetobacter phage SH-Ab15497 (10^9 PFU/mL) and its host, Acinetobacter baumannii strain 15497 in logarithmic growth phase (OD600 0.6-0.8), were inoculated at a volume ratio of 1:4 into an LB soft agar overlay (0.7% agar) and grown in a 37°C incubator for 7-8 hours. The lysate was then collected in SM buffer (50 mM Tris-HCl (pH 7.5), 100 mM NaCl, 8 mM MgSO4 and 0.1 g/L gelatin) and incubated overnight at 4°C on a shaker at 100 rpm. Cell debris was removed by centrifugation at 10000×g and 4°C for 10 minutes. Genomic DNA of phage SH-Ab15497 was extracted using a lambda phage genomic DNA kit (Zoman Biotech, ZP317-1).
HPLC-UV and LC-MS/MS analysis of genomic DNA from Acinetobacter phage SH-Ab 15497
Genomic DNA from Acinetobacter phage SH-Ab 15497 was digested enzymatically under neutral conditions. Briefly, a 150 μL mixture containing 5 μg of phage genomic DNA, or the positive control (4 μg of Z-containing ApPurZ PCR product) or the negative control (4 μg of A-containing ApPurZ PCR product), 2 μL of DNase I (NEB0303AA), 0.008 units of phosphodiesterase I (Sigma P3243), 2 μL of alkaline phosphatase (Takara 2120a) and 15 μL of 10× reaction buffer (500 mM Tris-HCl (pH 7.5), 100 mM NaCl, 10 mM MgCl2 and 10 mM ZnSO4) was incubated overnight at 37°C and then centrifuged through a centrifugal concentrator (3K MWCO; Millipore). The flow-through was analyzed on an Agilent 6420 Triple Quadrupole LC/MS instrument (Agilent Technologies, CA, USA). LC separation was performed on a Syncronis aQ column (150 mm × 4.6 mm, 3 μm) at a flow rate of 0.5 mL/min at room temperature. The mobile phase was 10 mM NH4Ac (pH 4.6) in water (solvent A) and methanol (solvent B), with a gradient of 20-32% solvent B over 0-12 minutes. The sample loading volume was 10 μL, and the UV detector was set to 260 nm. Standard compounds included commercially available deoxynucleosides (dA, dT, dC and dG) and in-house dZ (the product of dZTP hydrolysis by alkaline phosphatase (Takara 2120a), incubated at 37°C for 1 hour, after which the enzyme was retained by centrifugal filtration). For each standard compound, 2 μL (about 0.1 μg) was loaded. The enzymatically digested phage genomic DNA sample (10 μL) was analyzed by ESI-MS/MS on a Q Exactive HF/UltiMate 3000 RSLCnano (Thermo Fisher) instrument fitted with the same LC column, using the same elution protocol at a flow rate of 0.4 mL/min. Mass spectrometric detection was performed in positive electrospray ionization (ESI) mode.
Restriction endonuclease digestion of Acinetobacter phage SH-Ab 15497 genomic DNA and Z-containing ApPurZ DNA
Restriction endonuclease digestion of 0.2-0.3 μg of phage genomic DNA or Z-containing ApPurZ PCR product was carried out in 20 μL reaction mixtures according to the manufacturer's instructions.
Nanopore sequencing
Libraries of native DNA extracted from Acinetobacter phage SH-Ab 15497 and of the Z/A-containing ApPurZ PCR products were constructed with SQK-LSK109 (2019 version). The libraries were then loaded onto a FLO-MIN106 flow cell (R9.4, Oxford Nanopore Technologies) with more than 1100 active single pores. A MinION Mk1B sequencer was used for sequencing. Guppy 3.2.10, integrated in MinKNOW 3.6.5, was used for base calling.
Quality analysis of nanopore sequencing data
Of the 200537 reads initially collected from the phage DNA sample, the inventors assigned each read to either the phage or the host according to BLASTn results (e-value cutoff: 10^-20). In total, 58611 reads mapped to phage SH-Ab 15497 and 141926 reads mapped to its host. Read identity was then calculated from the BLASTn alignments. Read Q-scores were taken from the Guppy 3.2.10 summary file. The sequencing data for the Z/A-containing ApPurZ PCR products were analyzed for quality in the same way as the phage sample.
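The read assignment step can be sketched as a simple filter on each read's best BLASTn hit. The data structure below (read ID mapped to a source label and an e-value) is a hypothetical simplification; only the 10^-20 cutoff and the read totals come from the text. The final lines check that the reported phage and host counts add up to the reported total.

```python
def assign_reads(best_hits, evalue_cutoff=1e-20):
    """Count reads per source, keeping hits with e-value <= cutoff."""
    counts = {"phage": 0, "host": 0, "unassigned": 0}
    for source, evalue in best_hits.values():
        if evalue <= evalue_cutoff:
            counts[source] += 1
        else:
            counts["unassigned"] += 1
    return counts


# Hypothetical best hits, for illustration only.
hits = {
    "read_1": ("phage", 1e-80),
    "read_2": ("host", 1e-45),
    "read_3": ("host", 1e-5),  # too weak to assign
}
print(assign_reads(hits))

# Consistency check on the reported totals.
phage_reads, host_reads, total_reads = 58611, 141926, 200537
print(phage_reads + host_reads == total_reads)
print(round(phage_reads / total_reads, 3))  # fraction of reads assigned to phage
```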
Generating a mapping table for Z-containing hexamers
The raw signal of each read was extracted from the fast5 files with ont_fast5_api, and long (longer than 100) low-variance regions were removed. For each read, the corresponding NGS version of the phage DNA sequence was extracted from the BLASTn results described above. The raw signal was then aligned to the phage genomic DNA sequence using cwDTW (21). Based on this alignment, a cwDTW score (the absolute difference between the normalized raw signal and the normalized expected signal) was generated for each hexamer in the reads (21). Based on the hexamer cwDTW scores, the median normalized signal of each hexamer across all reads was extracted to produce a customized mapping table. cwDTW then used the customized mapping table to regenerate a new alignment file. After three rounds of this procedure, the cwDTW score distribution of the phage reads resembled that of the host reads, indicating that the final mapping table is valid.
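The table-building step, taking the median normalized signal per hexamer across all reads, can be sketched as below. This covers only that one step and does not call cwDTW or ont_fast5_api: the input is assumed to be a list of already aligned (hexamer, normalized signal) pairs, the values are invented, and writing the modified base as the letter "Z" in the hexamer string is a notational assumption.

```python
from collections import defaultdict
from statistics import median


def build_mapping_table(aligned_observations):
    """Map each hexamer to the median of its normalized signal values."""
    signals = defaultdict(list)
    for hexamer, value in aligned_observations:
        if len(hexamer) != 6:
            raise ValueError(f"not a hexamer: {hexamer!r}")
        signals[hexamer].append(value)
    return {hexamer: median(values) for hexamer, values in signals.items()}


# Hypothetical aligned observations pooled from several reads.
observations = [
    ("ZCGTZC", 0.10),
    ("ZCGTZC", 0.30),
    ("ZCGTZC", 0.20),
    ("GGCTTZ", -1.0),
    ("GGCTTZ", -0.5),
]
table = build_mapping_table(observations)
print(table)
```

In the procedure described in the text, a table of this kind is fed back into the aligner and rebuilt from the new alignment, and the cycle is repeated three times.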
Results, Analysis and Discussion
In 1977, Kirnos et al. reported that A is completely replaced by Z in the genome of cyanophage S-2L (1). When Z pairs with T, three hydrogen bonds are formed (1, 2). Z differs from other types of nucleobase modification (3) in that it alters Watson-Crick base pairing itself, which changes the physical, chemical and mechanical properties of double-stranded DNA (2, 4-6). However, no biochemical characterization of the enzymes required to synthesize a Z-containing genome has been reported. Understanding the biosynthesis of Z-containing genomes will help in studying how such genomes are distributed among species. Although the Z base has been unambiguously identified in carbonaceous meteorites (7) (such meteorites have been suggested as an important source of the organic compounds required for the emergence of life on early Earth), phage S-2L has so far been the only organism known to carry the Z base in its genome.
The inventors found, and report here for the first time, that the S-2L genome contains an open reading frame encoding a homolog of adenylosuccinate synthetase (PurA) (8), an enzyme of the purine biosynthesis pathway; this homolog is called PurZ in this application (Figure 4). In many organisms, PurA and adenylosuccinate lyase (PurB) are known to catalyze the conversion of inosine 5'-monophosphate (IMP) to adenosine 5'-monophosphate (AMP) (Figure 1, panel A) (9). This application proposes and demonstrates that PurZ takes part in an analogous reaction that supplies the Z nucleotide (Figure 1, panel A).
To test this hypothesis, the inventors first used bioinformatic analysis to identify PurZ from various phages and then produced and characterized the enzymes recombinantly (Figure 1, panel B and Table 1). Homology models were built for S-2L PurZ (CpPurZ) and its closest homolog, SbPurZ (Figure 1, panel C and Table 1). The active site of PurZ was then compared with that of Escherichia coli PurA (EcPurA) (10). The substrates of PurA are IMP, guanosine 5'-triphosphate (GTP) and Asp; PurA catalyzes transfer of the γ-phosphate of GTP to IMP, followed by displacement of the phosphate by aspartate to form adenylosuccinate (9). The catalytic residue Asp13 (D) in the GDxxKG motif of PurA is replaced by a Ser (S) residue in PurZ (Figure 1, panels D and E, and Figure 6). Molecular modeling suggests that replacing Asp with the smaller Ser makes room for the additional 2-amino group of the substrate (Figure 1, panel D) and may also alter the catalytic mechanism (Figure 7), because in PurA this Asp abstracts a proton from IMP (9-11). Multiple sequence alignment shows that, although Asp13 is highly conserved in PurA, the Asp→Ser substitution is shared by dozens of putative PurZ sequences (Figure 1, panel E, and Figures 6 and 8). In addition, the conserved R303 (EcPurA numbering), which forms a hydrogen bond with the 2'-hydroxyl of the IMP ribose, is replaced by the aliphatic Leu278/279 in Cp/SbPurZ (Figure 1, panel E, and Figure 6), consistent with the substrate being a deoxyribonucleotide rather than a ribonucleotide.
The inventors screened a range of PurZ homologs, selected four of them (Ap/Sp/Vp/SbPurZ, SEQ ID NOs: 1-4) (Figure 9, Figure 10 and Table 1), and assayed their enzymatic activity with 2'-deoxyguanosine 5'-monophosphate (dGMP), adenosine 5'-triphosphate (ATP) and Asp as substrates (Figure 1, panel A). The results agree with the inventors' homology models (Figure 1, panels D and F). Ap, Sp and VpPurZ come from phage isolates, and SbPurZ, found in a metagenomic contig, is also expected to be of phage origin (Table 2).
Incubating PurZ with saturating amounts of substrates produced a time-dependent change in the absorption spectrum, indicating that the reaction takes place at the nucleobase. The UV-Vis difference spectrum, obtained by subtracting the spectrum of the starting material from that of the product, has a maximum at 287 nm (Figure 10, panels A-E), consistent with conversion of dGMP (λmax = 252 nm) (12) into a species resembling 2-amino-2'-deoxyadenosine 5'-monophosphate (dZMP) (λmax = 247, 280 nm) (2). During the reaction, the time-dependent increase in absorbance at 287 nm (ΔA287) was proportional to phosphate release, as confirmed by a colorimetric phosphomolybdate assay (Figure 2, panel A and Figure 10, panel F) (13).
No reaction was observed when dGMP was replaced with guanosine 5'-monophosphate (GMP) or IMP, or when ATP was replaced with other ribonucleoside 5'-triphosphates (NTPs), confirming the substrate specificity of the reaction. When dIMP was used together with ATP and Asp, the enzyme retained 38% of its activity (Figure 2, panel A). The optimal pH of the reaction was about 9.5 (Figure 10, panel G). The Michaelis-Menten kinetics of PurZ were studied by continuous spectrophotometric monitoring of ΔA287 (Figure 10, panel H). For the four enzymes, kcat ranged from 2.3 to 16.5 min⁻¹, and the KM values for dGMP and ATP were 1.6-5.1 μM and 4.1-21.1 μM, respectively (Figure 11 and Table 3). The apparent KM for dGMP is far below the previously reported intracellular dGMP concentration in bacteria (about 50 μM) (14). The kcat/KM for dIMP is markedly (15-fold) lower than that for the physiological substrate dGMP.
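To make the comparison between KM and the intracellular dGMP concentration concrete, the short calculation below evaluates the Michaelis-Menten fractional saturation [S]/(KM + [S]) at 50 μM dGMP for the two ends of the reported KM range. It is a simplified sketch: it treats dGMP as the only varying substrate and assumes the other substrates are saturating, which the text does not state for in vivo conditions.

```python
def fractional_saturation(substrate_uM, km_uM):
    """Michaelis-Menten fractional saturation v/Vmax = [S] / (KM + [S])."""
    return substrate_uM / (km_uM + substrate_uM)


INTRACELLULAR_DGMP_UM = 50.0   # approximate value cited in the text (14)
KM_RANGE_UM = (1.6, 5.1)       # reported KM range for dGMP

saturation_high = fractional_saturation(INTRACELLULAR_DGMP_UM, KM_RANGE_UM[0])
saturation_low = fractional_saturation(INTRACELLULAR_DGMP_UM, KM_RANGE_UM[1])
print(f"{saturation_low:.3f} to {saturation_high:.3f}")
```

Under these assumptions the enzymes would operate above 90% of their maximal rate with respect to dGMP.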
The reaction intermediate and products of PurZ were confirmed by electrospray ionization tandem mass spectrometry (ESI-MS/MS; Figure 2, panels B-D and Figure 10, panels I-L). Incubating SbPurZ with dGMP, ATP and Asp produced two new peaks at m/z = 426 and 461, corresponding to adenosine 5'-diphosphate (ADP) and 2-aminodeoxyadenylosuccinate (ADAS), respectively (Figure 2, panels B and C). Omitting Asp produced a single new peak at m/z = 426, which matches the mass of both ADP and the putative reaction intermediate 6-phosphoryl-dGMP (Figure 10, panel I). In the absence of Asp, no phosphate release was detected, consistent with the phosphate being trapped in 6-phosphoryl-dGMP (Figure 10, panel J). An analogous intermediate, 6-phosphoryl-IMP, was observed in the structure of the complex of EcPurA crystallized in the presence of IMP, GTP and hadacidin (10).
Because cyanophage S-2L does not encode a PurB homolog, the inventors hypothesized that the subsequent conversion of ADAS to dZMP is catalyzed by the host PurB (Figure 3, panels A and B). Incubating the SbPurZ reaction product with recombinant EcPurB (Figure 9, panel E) caused the ADAS peak to disappear and a new peak to appear at m/z = 345, corresponding to dZMP (Figure 2, panel D), demonstrating that bacterial PurB is indeed able to synthesize dZMP.
To study substrate selectivity, the inventors mutated substrate-interacting residues of SbPurZ to the corresponding residues of EcPurA. All mutations impaired PurZ activity to varying degrees (Table 4). Notably, the S15D mutation completely abolished SbPurZ activity, consistent with a role for S15 in accommodating the 2-amino group of dGMP (Figure 1, panel D). The T274G mutant showed a 271-fold increase in KM for Asp, consistent with a role for T274 in Asp binding (Figure 7). Mutating the three residues that interact with the adenine of ATP (N306T, F307K and N309D) increased the KM for ATP 89- to 450-fold (Table 4), yet none of these mutants was active with GTP. Similarly, none of the mutants showed activity toward IMP. dZMP must be phosphorylated to 2-amino-2'-deoxyadenosine 5'-triphosphate (dZTP) before a polymerase can incorporate it into the phage genome. This reaction is catalyzed efficiently by Salmonella enterica GMP kinase (SeGK), suggesting that enzymes from the bacterial host can fulfil this function (Figure 3, panel B; Figure 9, panel J; and Figure 12).
To find out whether other enzymes take part in the biosynthesis of Z-containing genomes, the inventors examined the genomic neighborhood of PurZ and identified a DNA polymerase, an HD-domain-containing hydrolase-like enzyme and a DUF550-domain-containing protein (Figure 3, panel A).
The HD-domain enzyme (also called dATPase in this application) occurs in the genomes of 20 PurZ-containing phages (Figure 3, panel A). These HD-domain enzymes have highly conserved metal- and ligand-binding pockets, but their sequences are highly diverse (Figure 3, panel A and Figure 13). The inventors prepared recombinant Cp/Ap/Sp HD enzymes (SEQ ID NOs: 5-7) (Figure 9, panels F-H). Although they share only 24-34% sequence identity (Figure 3, panel A), all three enzymes show 2'-deoxyadenosine 5'-triphosphate triphosphohydrolase (dATPase) activity: they hydrolyze 2'-deoxyadenosine 5'-triphosphate (dATP) to 2'-deoxyadenosine (dA) and triphosphate, with the highest activity when Co²⁺ is the divalent metal cofactor (Figure 3, panel C and Figures 14-16; apparent kcat = 0.5-5.2 s⁻¹, apparent KM = 6.5-74.8 μM). The apparent KM for dATP lies within the range of intracellular dATP concentrations previously reported for bacteria (15). The enzyme is highly specific for dATP and hydrolyzes NTPs and other dNTPs much less efficiently (Figure 3, panel C). It also hydrolyzes 2'-deoxyadenosine 5'-diphosphate (dADP) and 2'-deoxyadenosine 5'-monophosphate (dAMP) to dA, releasing pyrophosphate and phosphate, respectively (Figures 15 and 16). The dATPase can therefore promote the synthesis of Z-containing genomes by specifically removing dATP and its precursor dADP from the host nucleotide pool (16), thereby preventing A from being incorporated into the phage genome.
The DUF550-domain-containing protein is also of interest because it co-occurs with PurZ. Recombinant ApDUF550 (SEQ ID NO: 8) shows dATP and 2'-deoxyguanosine 5'-triphosphate (dGTP) pyrophosphohydrolase activity, hydrolyzing dATP and dGTP to pyrophosphate plus dAMP and dGMP, respectively. Activity is highest with Co²⁺ as the divalent metal cofactor, and there is little or no activity toward NTPs and pyrimidine dNTPs (Figure 3, panel D and Figure 17). This DUF550-containing enzyme may therefore supply dGMP as the PurZ substrate, raising dZTP levels, while depleting dATP to further favor Z incorporation (Figure 3, panel B).
The identification of PurZ and of the other genes involved in dZTP biosynthesis and genome incorporation provides a basis for studying how widely Z-containing genomes occur in nature. The predicted PurZ sequences comprise 60 sequences from phage isolates and 13 sequences from phage contigs in metagenomes (Figure 1, panel B and Table 1). PurZ-containing phages belong mainly to the families Podoviridae and Siphoviridae (Figure 1, panel B). In the post-genomic era, chemical determination of nucleotide content or base composition is no longer routine, so the possibility that these phages carry a modified purine in their DNA may have been overlooked. The inventors developed a protocol to confirm genomic incorporation of Z and selected one of these phages, Acinetobacter phage SH-Ab 15497, for further study (17). Phage DNA was prepared and digested with a combination of DNase I, phosphodiesterase I and alkaline phosphatase, followed by LC-UV and LC-MS/MS analysis (Figure 4, panels A and B). Compared with normal DNA (also called A-DNA here, as opposed to Z-DNA), the digests of the phage DNA and of Z-DNA (positive control: the PCR product of ApPurZ amplified with Q5 polymerase using dZTP in place of dATP in the dNTP mix; Figure 18) contained a negligible amount of dA (Figure 4, panel A). Quantifying the UV peaks with extinction coefficients gave deoxynucleoside molar ratios that obey Chargaff's rules: the dZ:dT ratio was 0.99 and the dC:dG ratio was 1.05. Cation masses matching the dZ nucleoside and the Z base were detected (Figure 4, panel B and Figure 19). These results show that in the genome of SH-Ab 15497, Z almost completely (and possibly completely) replaces A in pairing with T, that Z-containing genomes are not limited to cyanophage S-2L, and that they may be more widespread than previously understood.
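The quantification step above (UV peak area divided by extinction coefficient, then ratios of the resulting molar amounts) can be sketched as follows. The peak areas and extinction coefficients in this example are placeholder numbers, not measured data; they were chosen only so that the output reproduces the reported ratios, and the function name is hypothetical. The sketch also assumes that all peaks are integrated at a wavelength for which the coefficients apply.

```python
def molar_ratios(peak_areas, extinction):
    """Convert UV peak areas to relative molar amounts (area / extinction
    coefficient) and return the two base-pair ratios discussed in the text."""
    moles = {name: peak_areas[name] / extinction[name] for name in peak_areas}
    return {
        "dZ:dT": moles["dZ"] / moles["dT"],
        "dC:dG": moles["dC"] / moles["dG"],
    }


# Placeholder values in arbitrary units (NOT measured data).
extinction = {"dZ": 10.0, "dT": 8.0, "dC": 7.0, "dG": 14.0}
peak_areas = {"dZ": 99.0, "dT": 80.0, "dC": 73.5, "dG": 140.0}

ratios = molar_ratios(peak_areas, extinction)
print({pair: round(value, 2) for pair, value in ratios.items()})
```

Ratios close to 1 for both pairs are what Chargaff's rules predict when Z takes the place of A opposite T.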
Because phage DNA extracts inevitably contain varying amounts of host DNA fragments, the inventors performed nanopore sequencing of the crude extract. This showed that the quality of the phage reads (quantified by Q-score, read identity and cwDTW alignment score) resembles that of the Z-DNA control, whereas the quality of the host reads resembles that of normal DNA (Figure 4, panels C-E, and Figures 20 and 21) (18-20). Assignment of the nanopore signals yielded approximately 100M reads each for phage and host DNA, which allowed the inventors to analyze the data statistically and to conclude that the phage DNA is virtually free of adenine, whereas Z is unlikely to be incorporated into host DNA.
Many phage DNA modifications serve to avoid attack by host restriction enzymes (3). The genomic DNA of S-2L has been reported to resist most restriction enzymes (21, 22). To confirm that this holds generally for Z-containing genomes, the inventors examined how susceptible the genomic DNA of SH-Ab 15497 and the Z-PCR product are to restriction endonuclease digestion. The Z-substituted DNA from both sources resisted digestion by all enzymes whose recognition site contains one or more A, including close homologs of two endogenous DNA restriction enzymes of the host (EcoRI and PstI). The only exception was TaqI, which is known to tolerate a variety of DNA modifications (21). In contrast, HaeIII and Sau96I, which both recognize GC-only sequences, readily digested the phage DNA and the Z-DNA control (Figure 4, panels F and G; Figure 22; Table 5). One evolutionary advantage conferred by a Z-containing genome may therefore be the ability to escape restriction digestion in many bacteria. The inventors note that no obvious degradation was observed when the phage DNA was digested with Sau3AI, even though the phage DNA contains more than 200 Sau3AI recognition sites; this is clearly consistent with complete replacement of A by Z in the phage genome (Figure 22). The inventors further confirmed that PurZ-containing phages are widely distributed on Earth (data not shown).
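The pattern reported above can be summarized with a simple rule of thumb: an enzyme is expected to be blocked by Z-substituted DNA if a defined position of its recognition site is A, with TaqI treated as the tolerant exception described in the text. The sketch below encodes that rule and adds a helper that counts recognition sites on one strand of an ordinary A/C/G/T sequence. The recognition sequences are the standard ones for these enzymes; the rule itself and the function names are illustrative assumptions, not a method taken from the application.

```python
import re

# Standard recognition sequences (5'->3'); N = any base.
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "PstI": "CTGCAG",
    "Sau3AI": "GATC",
    "TaqI": "TCGA",
    "HaeIII": "GGCC",
    "Sau96I": "GGNCC",
}
TOLERANT = {"TaqI"}  # exception reported in the text


def predicted_to_cut_z_dna(enzyme):
    """Rule of thumb: sites with a defined A are blocked unless tolerant."""
    if enzyme in TOLERANT:
        return True
    return "A" not in RECOGNITION_SITES[enzyme]


def count_sites(sequence, site):
    """Count (possibly overlapping) occurrences of a site on one strand."""
    pattern = "(?=" + site.replace("N", "[ACGT]") + ")"
    return len(re.findall(pattern, sequence.upper()))


predictions = {name: predicted_to_cut_z_dna(name) for name in RECOGNITION_SITES}
print(predictions)
print(count_sites("GATCGATCGGCC", RECOGNITION_SITES["Sau3AI"]))
```

Counting sites in the unmodified reference sequence and comparing with the observed digestion pattern is the kind of check behind the Sau3AI observation above.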
Table 1. Putative PurZ enzymes screened and characterized in this application (bold italics indicate the species tested).
Figure PCTCN2022071726-appb-000014
"N/A": not applicable
Figure PCTCN2022071726-appb-000015
Figure PCTCN2022071726-appb-000016
Figure PCTCN2022071726-appb-000017
Table 3. Kinetic parameters of PurZ.
Figure PCTCN2022071726-appb-000018
Figure PCTCN2022071726-appb-000019
Figure PCTCN2022071726-appb-000020
Example 2. Identification, Characterization and Testing of PurZ0
Starting from the 2-aminodeoxyadenylosuccinate (ADAS) synthetase identified in Example 1, the inventors searched known phage gene databases by sequence similarity and phylogenetic-tree analysis and obtained further phage members of the ADAS synthetase family, including SEQ ID NOs: 9-69 and 92-146. Comparative analysis showed that they share the catalytic motif and key residues of the PurZ examples identified and tested in Example 1, so they are expected to have the same function. Further structural analysis showed that some members of the ADAS synthetase family use GTP/dGTP, together with dGMP and Asp, as substrates. This differs from the exemplary PurZ of Example 1, which uses ATP/dATP together with dGMP and Asp, although the overall reaction scheme is very similar. To distinguish the two, ADAS synthetases that use ATP/dATP as substrate are called PurZ, and ADAS synthetases that use GTP/dGTP as substrate are called PurZ0.
In this example, PurZ0 was characterized and tested using GpPurZ0 (species: Gordonia phage Archimedes; UniProt ID: A0A7L7SI10; SEQ ID NO: 121) as an example.
The substrate-specificity test and the coupled assay with PurB were carried out as in Example 1 (conditions were the same apart from the enzyme and the substrate being varied). The results are shown in Figures 23 and 24: they confirm that GpPurZ0 is specific for GTP/dGTP as substrate and show the HPLC-MS detection of GpPurZ0 and EcPurB activity.
The crystal structure of GpPurZ0 was determined as follows:
For crystallization of GpPurZ0, the protein eluted from the TALON column was dialyzed against 2 L of buffer [20 mM Tris·HCl, pH 7.5, 5 mM BME] for 3 hours and then loaded onto a 10 mL Q Sepharose anion-exchange column, which was eluted with a linear gradient of buffer containing 300 to 700 mM KCl. The protein in the major GpPurZ0 peak was collected and concentrated to a final volume of 5 mL with a centrifugal concentrator (30K MWCO; Millipore). The protein solution was then injected onto a Superdex 200 size-exclusion column (300 mL) pre-equilibrated with buffer [20 mM Tris·HCl, pH 7.5, 0.2 M KCl, 1 mM dithiothreitol] and eluted with buffer B. The gel-filtration eluate was re-concentrated and exchanged into storage buffer [10 mM HEPES/KOH, pH 7.4, 50 mM KCl, 1 mM Tris-(2-carboxyethyl)-phosphine hydrochloride]. The final protein concentration was adjusted to 10 mg/mL. Initial screening for GpPurZ0 crystals was carried out by the sitting-drop method in 96-well plates with a crystallization robot (Gryphon, Art Robbins). The best conditions for crystal growth were 1.26 M ammonium sulfate, 0.2 M lithium sulfate, 0.1 M Tris pH 8.5 and 5 mM GTP. The crystallization reagent supplemented with 25% glycerol was used as cryoprotectant, and the crystals were flash-cooled in liquid nitrogen. Diffraction data with a resolution of
Figure PCTCN2022071726-appb-000021
were collected at SSRF 18U (Shanghai Synchrotron Radiation Facility). The data were processed with HKL3000. Molecular replacement was carried out in PHENIX using a structural model generated with the PHYRE2 web server. The structure was built manually in Coot according to the electron density, refined further in PHENIX, and deposited in the RCSB Protein Data Bank (accession code 7VF6). Data-collection and final refinement statistics for the crystal structure are given in the appended table (Table 7). All structure figures were generated with UCSF Chimera (Figure 25).
Table 7. Data-collection and refinement statistics for the GpPurZ0 crystal.
Figure PCTCN2022071726-appb-000022
Figure PCTCN2022071726-appb-000023
Sequence analysis shows that PurZ and PurZ0 have the same dGMP active-site motif, GSxxKG (where x is any amino acid residue; the motif is usually located at positions 13-18 of the amino acid sequence), but differ in the ATP/dATP versus GTP/dGTP active site. The ATP/dATP active site of PurZ is NxxN/Q (where x is any amino acid residue; usually located around position 300 of the amino acid sequence), whereas the GTP/dGTP active site of PurZ0 is T/SxxD (where x is any amino acid residue; also usually located around position 300). PurZ and PurZ0 can be told apart by these structural features.
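The motif rules above lend themselves to a simple screening sketch. The code below is a rough illustration only: the search windows (first 40 residues for GSxxKG, residues 281-320 for the nucleotide-site motif) are assumptions chosen to match "positions 13-18" and "around position 300", the function name is hypothetical, and the test sequences are artificial poly-alanine strings rather than real proteins. Because four-residue motifs occur by chance, real sequences would usually need an alignment to confirm the motif position.

```python
import re

DGMP_MOTIF = re.compile(r"GS..KG")     # shared by PurZ and PurZ0
ATP_MOTIF = re.compile(r"N..[NQ]")     # PurZ: NxxN/Q
GTP_MOTIF = re.compile(r"[TS]..D")     # PurZ0: T/SxxD


def classify(sequence, ntp_window=(280, 320)):
    """Classify a protein sequence as 'PurZ', 'PurZ0', 'ambiguous' or
    'not PurZ-like' using the motif rules (windows are assumptions)."""
    seq = sequence.upper()
    if not DGMP_MOTIF.search(seq[:40]):
        return "not PurZ-like"
    start, end = ntp_window
    region = seq[start:end]
    has_atp = bool(ATP_MOTIF.search(region))
    has_gtp = bool(GTP_MOTIF.search(region))
    if has_atp and not has_gtp:
        return "PurZ"
    if has_gtp and not has_atp:
        return "PurZ0"
    return "ambiguous"


# Artificial sequences: GSxxKG at positions 13-18, second motif at 301-304.
purz_like = "A" * 12 + "GSAAKG" + "A" * 282 + "NAAN" + "A" * 50
purz0_like = "A" * 12 + "GSAAKG" + "A" * 282 + "TAAD" + "A" * 50
results = [classify(purz_like), classify(purz0_like), classify("A" * 400)]
print(results)
```

A PurA-type sequence carrying GDxxKG instead of GSxxKG would be reported as "not PurZ-like" by this sketch.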
Example 3. Application Test in Yeast
The gene sequences of the two proteins ApPurZ and dATPase (see Example 1) were cloned by homologous recombination into plasmid pRS426 (uracil auxotrophy marker), and the gene sequence of DUF550 (see Example 1) was likewise cloned into plasmid pRS425 (leucine auxotrophy marker) (Figure 26). After the constructs had been verified by sequencing, both plasmids were transformed into yeast, which was plated on solid synthetic yeast medium lacking uracil and leucine. After PCR had confirmed successful transformation with the plasmids carrying the target genes, a single colony was picked and cultured at 30°C until the OD600 reached 0.8, and galactose was added to induce overexpression of the ApPurZ, dATPase and DUF550 proteins. After 18 h (OD = 3.5), the cells were collected by high-speed centrifugation, and the yeast genomic DNA was extracted and analyzed by HPLC coupled with mass spectrometry. dZ was found in the genomic DNA of yeast in which galactose had induced overexpression of the three proteins. Quantification by integrating the dZ peak in the UV chromatogram and applying the extinction coefficients showed that dZ replaced 20% of dA in the yeast genome (Figure 27). As Figure 27 shows, in the genomic DNA of galactose-induced yeast cells, (dZ+dA):dT was about 0.85 and dZ:(dZ+dA) was about 0.2.
Exemplary embodiments of the inventions of this application have been described above. Those skilled in the art can, however, modify or improve the exemplary embodiments described herein without departing from the spirit and scope of this application, and the resulting variants or equivalents also fall within the scope of this application.
参考文献references
1.M.D.Kirnos,I.Y.Khudyakov,N.I.Alexandrushkina,B.F.Vanyushin,2-aminoadenine is an adenine substituting for a base in S-2L cyanophage DNA.Nature 270,369-370(1977).1. M.D. Kirnos, I.Y. Khudyakov, N.I. Alexandrushkina, B.F. Vanyushin, 2-aminoadenine is an adenine substituting for a base in S-2L cyanophage DNA. Nature 270, 369-370 (1977).
2.I.Y.Khudyakov,M.D.Kirnos,N.I.Alexandrushkina,B.F.Vanyushin,Cyanophage S-2L contains DNA with 2,6-diaminopurine substituted for adenine.Virology 88,8-18(1978).2. I.Y.Khudyakov, M.D.Kirnos, N.I.Alexandrushkina, B.F.Vanyushin, Cyanophage S-2L contains DNA with 2,6-diaminopurine substituted for adenine. Virology 88, 8-18 (1978).
3.P.Weigele,E.A.Raleigh,Biosynthesis and function of modified bases in bacteria and their viruses.Chem.Rev.116,12655-12687(2016).3. P. Weigele, E.A. Raleigh, Biosynthesis and function of modified bases in bacteria and their viruses. Chem. Rev. 116, 12655-12687 (2016).
4.C.Cheong,I.Tinoco,Jr.,A.Chollet,Thermodynamic studies of base pairing involving 2,6-diaminopurine.Nucleic Acids Res.16,5115-5122(1988).4. C. Cheong, I. Tinoco, Jr., A. Chollet, Thermodynamic studies of base pairing involving 2, 6-diaminopurine. Nucleic Acids Res. 16, 5115-5122 (1988).
5.J.Sagi,E.Szakonyi,M.Voflickova,J.Kypr,Unusual contribution of 2-aminoadenine to the thermostability of DNA.J.Biomol.Struct.Dyn.13,1035-1041(1996).5. J. Sagi, E. Szakonyi, M. Voflickova, J. Kypr, Unusual contribution of 2-aminoadenine to the thermostability of DNA. J. Biomol. Struct. Dyn. 13, 1035-1041 (1996).
6. M. Cristofalo et al., Nanomechanics of Diaminopurine-Substituted DNA. Biophys. J. 116, 760-771 (2019).
7. M. P. Callahan et al., Carbonaceous meteorites contain a wide range of extraterrestrial nucleobases. Proc. Natl. Acad. Sci. U.S.A. 108, 13995-13998 (2011).
8. P. Marliere, Patent WO2003093461 (2003).
9. R. B. Honzatko, H. J. Fromm, Structure-function studies of adenylosuccinate synthetase from Escherichia coli. Arch. Biochem. Biophys. 370, 1-8 (1999).
10. Z. Hou, M. Cashel, H. J. Fromm, R. B. Honzatko, Effectors of the stringent response target the active site of Escherichia coli adenylosuccinate synthetase. J. Biol. Chem. 274, 17505-17510 (1999).
11. C. Kang, N. Sun, R. B. Honzatko, H. J. Fromm, Replacement of Asp333 with Asn by site-directed mutagenesis changes the substrate specificity of Escherichia coli adenylosuccinate synthetase from guanosine 5′-triphosphate to xanthosine 5′-triphosphate. J. Biol. Chem. 269, 24046-24049 (1994).
12. M. J. Cavaluzzi, P. N. Borer, Revised UV extinction coefficients for nucleoside-5′-monophosphates and unpaired DNA and RNA. Nucleic Acids Res. 32, e13 (2004).
13. G. Hua et al., Characterization of santalene synthases using an inorganic pyrophosphatase coupled colorimetric assay. Anal. Biochem. 547, 26-36 (2018).
14. B. D. Bennett et al., Absolute metabolite concentrations and implied enzyme active site occupancy in Escherichia coli. Nat. Chem. Biol. 5, 593-599 (2009).
15. B. R. Bochner, B. N. Ames, Complete analysis of cellular nucleotides by two-dimensional thin layer chromatography. J. Biol. Chem. 257, 9759-9769 (1982).
16. B. L. Greene et al., Ribonucleotide Reductases: Structure, Chemistry, and Metabolism Suggest New Therapeutic Targets. Annu. Rev. Biochem. 89, 45-75 (2020).
17. Y. Hua et al., Characterization and whole genome analysis of a novel bacteriophage SH-Ab 15497 against multidrug resistant Acinetobacater baummanii. Acta Biochim. Biophys. Sin. (Shanghai) 51, 1079-1081 (2019).
18. J. J. Kasianowicz, E. Brandin, D. Branton, D. W. Deamer, Characterization of individual polynucleotide molecules using a membrane channel. Proc. Natl. Acad. Sci. U.S.A. 93, 13770-13773 (1996).
19. L. Xu, M. Seki, Recent advances in the detection of base modifications using the Nanopore sequencer. J. Hum. Genet. 65, 25-33 (2020).
20. R. Han, Y. Li, X. Gao, S. Wang, An accurate and rapid continuous wavelet dynamic time warping algorithm for end-to-end mapping in ultra-long nanopore sequencing. Bioinformatics 34, i722-i731 (2018).
21. M. Szekeres, A. V. Matveyev, Cleavage and sequence recognition of 2,6-diaminopurine-containing DNA by site-specific endonucleases. FEBS Lett. 222, 89-94 (1987).
22. A. Chollet, E. Kawashima, DNA containing the base analogue 2-aminoadenine: preparation, use as hybridization probes and cleavage by restriction endonucleases. Nucleic Acids Res. 16, 305-317 (1988).
23. P. W. Rothemund, Folding DNA to create nanoscale shapes and patterns. Nature 440, 297-302 (2006).
24. L. Ceze, J. Nivala, K. Strauss, Molecular digital data storage using DNA. Nat. Rev. Genet. 20, 456-466 (2019).
25. R. T. Schooley et al., Development and use of personalized bacteriophage-based therapeutic cocktails to treat a patient with a disseminated resistant Acinetobacter baumannii infection. Antimicrob. Agents Chemother. 61, e00954-17 (2017).
26. R. M. Dedrick et al., Engineered bacteriophages for treatment of a patient with a disseminated drug-resistant Mycobacterium abscessus. Nat. Med. 25, 730-733 (2019).
27. Z. D. Moye, J. Woolston, A. Sulakvelidze, Bacteriophage Applications for Food Production and Processing. Viruses 10, 205 (2018).
28. Y. Zhang et al., A semi-synthetic organism that stores and retrieves increased genetic information. Nature 551, 644-647 (2017).
29. S. Hoshika et al., Hachimoji DNA and RNA: A genetic system with eight building blocks. Science 363, 884-887 (2019).
30. D. A. Malyshev et al., A semi-synthetic organism with an expanded genetic alphabet. Nature 509, 385-388 (2014).
31. J. Gollihar, M. Levy, A. D. Ellington, Many paths to the origin of life. Science 343, 259-260 (2014).
32. L. Zimmermann et al., A completely reimplemented MPI bioinformatics toolkit with a new HHpred server at its core. J. Mol. Biol. 430, 2237-2243 (2018).
33. C. V. Iancu, T. Borza, H. J. Fromm, R. B. Honzatko, Feedback inhibition and product complexes of recombinant mouse muscle adenylosuccinate synthetase. J. Biol. Chem. 277, 40536-40543 (2002).
34. J. Bridwell-Rabb, G. Kang, A. Zhong, H. W. Liu, C. L. Drennan, An HD domain phosphohydrolase active site tailored for oxetanocin-A biosynthesis. Proc. Natl. Acad. Sci. U.S.A. 113, 13750-13755 (2016).
35. M. Kanehisa, S. Goto, S. Kawashima, A. Nakaya, The KEGG databases at GenomeNet. Nucleic Acids Res. 30, 42-46 (2002).
36. J. A. Gerlt et al., Enzyme function initiative-enzyme similarity tool (EFI-EST): A web tool for generating protein sequence similarity networks. Biochim. Biophys. Acta 1854, 1019-1037 (2015).
37. P. Shannon et al., Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 13, 2498-2504 (2003).
38. K. Katoh, D. M. Standley, MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol. Biol. Evol. 30, 772-780 (2013).
39. G. E. Crooks, G. Hon, J. M. Chandonia, S. E. Brenner, WebLogo: a sequence logo generator. Genome Res. 14, 1188-1190 (2004).
40. J. A. Gerlt, Genomic Enzymology: Web tools for leveraging protein family sequence-function space and genome context to discover novel functions. Biochemistry 56, 4293-4308 (2017).
41. S. Mehrotra, H. Balaram, Kinetic characterization of adenylosuccinate synthetase from the thermophilic archaea Methanocaldococcus jannaschii. Biochemistry 46, 12821-12832 (2007).
42. G. Kohn et al., High inorganic triphosphatase activities in bacteria and mammalian cells: identification of the enzymes involved. PLoS One 7, e43879 (2012).
43. M. Stoiber et al., De novo identification of DNA modifications enabled by genome-guided nanopore signal processing. bioRxiv, 094672v2 (2017).

Claims (16)

  1. A polypeptide having 2-aminodeoxyadenylosuccinate (ADAS) synthetase activity, capable of catalyzing the formation of 2-aminodeoxyadenylosuccinate from dGMP and Asp together with one or more of ATP, dATP, GTP and dGTP as substrates, wherein, relative to the GDxxKG catalytic motif of adenylosuccinate synthetase (PurA), the catalytic motif of the polypeptide is changed to GSxxKG, where x represents any amino acid residue.
  2. The polypeptide of claim 1, wherein, when the polypeptide is aligned with the amino acid sequence of adenylosuccinate synthetase from Escherichia coli (SEQ ID NO: 72), the residue corresponding to position 303 of SEQ ID NO: 72 is changed from R to L.
  3. The polypeptide of claim 1 or 2, wherein, when the polypeptide is aligned with the amino acid sequence shown in SEQ ID NO: 2, the polypeptide has T at the position corresponding to position 274 of SEQ ID NO: 2, N at position 306, F at position 307 and N at position 309.
  4. The polypeptide of any one of claims 1-3, comprising the sequence shown in any one of SEQ ID NOs: 1-4, 9-69, 71 and 92-146; or a variant of the sequence shown in any one of SEQ ID NOs: 1-4, 9-69, 71 and 92-146 that has one or more amino acid insertions, deletions and/or substitutions and retains 2-aminodeoxyadenylosuccinate (ADAS) synthetase activity; or a fragment of the sequence shown in any one of SEQ ID NOs: 1-4, 9-69, 71 and 92-146 that retains the catalytic motif.
  5. A polypeptide having 2'-deoxyadenosine 5'-triphosphate triphosphohydrolase (dATPase) activity, capable of catalyzing the hydrolysis of dATP to 2'-deoxyadenosine (dA), and preferably also capable of catalyzing the hydrolysis of dADP and dAMP to 2'-deoxyadenosine (dA), the polypeptide comprising a metal- and ligand-binding pocket and preferably comprising Co²⁺ as a divalent metal cofactor.
  6. The polypeptide of claim 5, comprising the sequence shown in any one of SEQ ID NOs: 5-7; or a variant of the sequence shown in any one of SEQ ID NOs: 5-7 and 73-91 that has one or more amino acid insertions, deletions and/or substitutions and retains dATPase activity; or a fragment of the sequence shown in any one of SEQ ID NOs: 5-7 and 73-91 that retains the catalytic motif.
  7. A polypeptide having dATP and dGTP pyrophosphohydrolase activity, capable of catalyzing the hydrolysis of dATP to dAMP and of dGTP to dGMP, the polypeptide comprising a metal- and ligand-binding pocket and preferably comprising Co²⁺ as a divalent metal cofactor.
  8. The polypeptide of claim 7, comprising the sequence shown in SEQ ID NO: 8; or a variant of the sequence shown in SEQ ID NO: 8 that has one or more amino acid insertions, deletions and/or substitutions and retains dATP and dGTP pyrophosphohydrolase activity; or a fragment of the sequence shown in SEQ ID NO: 8 that retains the catalytic motif.
  9. A nucleic acid molecule encoding the polypeptide of any one of claims 1-8.
  10. A vector comprising the nucleic acid molecule of claim 9.
  11. A method for engineering a bacteriophage, comprising introducing a nucleic acid molecule encoding the polypeptide of any one of claims 1-4 into the genome of the bacteriophage, so that the bacteriophage expresses the polypeptide of any one of claims 1-4.
  12. The method of claim 11, further comprising introducing into the genome of the bacteriophage a nucleic acid molecule encoding the polypeptide of any one of claims 5-6, so that the bacteriophage expresses the polypeptide of any one of claims 5-6; and/or introducing into the genome of the bacteriophage a nucleic acid molecule encoding the polypeptide of any one of claims 7-8, so that the bacteriophage expresses the polypeptide of any one of claims 7-8.
  13. The method of claim 11 or 12, further comprising introducing into the genome of the bacteriophage a nucleic acid molecule encoding adenylosuccinate lyase (PurB), so that the bacteriophage expresses adenylosuccinate lyase (PurB); and/or introducing into the genome of the bacteriophage a nucleic acid molecule encoding GMP kinase (GK), so that the bacteriophage expresses GMP kinase (GK).
  14. A bacteriophage obtained by the method of any one of claims 11-13.
  15. A host cell, for example a bacterial cell, comprising the bacteriophage of claim 14.
  16. Use of the bacteriophage of claim 14 or the host cell of claim 15 in diaminopurine deoxyribonucleotide (dZTP) synthesis, DNA synthesis, DNA origami, DNA-based data storage, preparation of an antibacterial drug, preparation of a bactericide, or preparation of a preservative.
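
The motif distinction recited in claim 1 (GDxxKG in PurA versus GSxxKG in the claimed polypeptide, with x being any residue) can be illustrated with a short pattern scan. The following Python sketch is illustrative only: the function name `find_motifs` and both toy sequences are hypothetical and are not taken from the sequence listing. Finding the motif in a sequence does not by itself establish ADAS synthetase activity, which the claim requires separately.

```python
import re

# "x" in the claimed motifs means any amino acid residue, i.e. "." in a regex.
# Lookaheads are used so that overlapping occurrences are also reported.
PURA_MOTIF = re.compile(r"(?=GD..KG)")  # GDxxKG, motif attributed to PurA in claim 1
ADAS_MOTIF = re.compile(r"(?=GS..KG)")  # GSxxKG, altered motif recited in claim 1


def find_motifs(sequence):
    """Return 1-based start positions of each motif in a one-letter protein sequence."""
    seq = sequence.upper()
    return {
        "GDxxKG": [m.start() + 1 for m in PURA_MOTIF.finditer(seq)],
        "GSxxKG": [m.start() + 1 for m in ADAS_MOTIF.finditer(seq)],
    }


# Hypothetical toy sequences for illustration (NOT from SEQ ID NO: 1-146).
toy_pura_like = "MKNVVVLGTQWGDEGKGKIVD"
toy_adas_like = "MKNVVVLGTQWGSEGKGKIVD"

if __name__ == "__main__":
    print(find_motifs(toy_pura_like))
    print(find_motifs(toy_adas_like))
```

A scan like this only screens candidates. The residue positions in claims 2 and 3 are defined by alignment to SEQ ID NO: 72 or SEQ ID NO: 2, so checking them would additionally require a sequence alignment step that is not shown here.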
PCT/CN2022/071726 2021-01-14 2022-01-13 Enzyme involved in phage diaminopurine synthesis, and use thereof WO2022152192A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110045505.X 2021-01-14
CN202110045505 2021-01-14

Publications (1)

Publication Number Publication Date
WO2022152192A1 (en) 2022-07-21

Family

ID=82446922

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/071726 WO2022152192A1 (en) 2021-01-14 2022-01-13 Enzyme involved in phage diaminopurine synthesis, and use thereof

Country Status (2)

Country Link
CN (2) CN114836399B (en)
WO (1) WO2022152192A1 (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060270005A1 (en) * 2002-04-30 2006-11-30 Institut Pasteur Genomic library of cyanophage s-2l and functional analysis

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10767175B2 (en) * 2016-06-08 2020-09-08 Agilent Technologies, Inc. High specificity genome editing using chemically modified guide RNAs
CN110607335B (en) * 2018-06-14 2021-08-03 中国科学院微生物研究所 Biosynthesis method of nicotinamide adenine dinucleotide compound

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
DATABASE PROTEIN 15 October 2021 (2021-10-15), ANONYMOUS: "adenylosuccinate synthetase [Vibrio phage phiVC8]", XP055951692, retrieved from NCBI Database accession no. YP_009140156 *
DATABASE PROTEIN 19 August 2019 (2019-08-19), ANONYMOUS: "MAG: DUF550 domain-containing protein [Sinobacteraceae bacterium]", XP055951697, retrieved from NCBI Database accession no. TXG97711 *
DATABASE PROTEIN 19 August 2019 (2019-08-19), ANONYMOUS: "MAG: HD domain-containing protein [Sinobacteraceae bacterium]", XP055951696, retrieved from NCBI Database accession no. TXG97712 *
DATABASE PROTEIN 19 August 2019 (2019-08-19), ANONYMOUS: "MAG: hypothetical protein E6R08_06220 [Sinobacteraceae bacterium]", XP055951691, retrieved from NCBI Database accession no. TXG97710 *
HUA YUNFEN, XU MENGSHA, WANG RUI, ZHANG YIYUAN, ZHU ZHAOQIN, GUO MINQUAN, HE PING: "Characterization and whole genome analysis of a novel bacteriophage SH-Ab 15497 against multidrug resistant Acinetobacater baummanii", ACTA BIOCHIMICA BIOPHYSICA SINICA, BLACKWELL PUBLISHING, INC., MALDEN, MA, US, vol. 51, no. 10, 1 September 2019 (2019-09-01), pages 1079-1081, XP055951731, ISSN: 1672-9145, DOI: 10.1093/abbs/gmz094 *
ZHOU YAN, XU XUEXIA, WEI YIFENG, CHENG YU, GUO YU, KHUDYAKOV IVAN, LIU FULI, HE PING, SONG ZHANGYUE, LI ZHI, GAO YAN, ANG EE LUI, et al.: "A widespread pathway for substitution of adenine by diaminopurine in phage genomes", SCIENCE, AMERICAN ASSOCIATION FOR THE ADVANCEMENT OF SCIENCE, US, vol. 372, no. 6541, 30 April 2021 (2021-04-30), pages 512-516, XP055951729, ISSN: 0036-8075, DOI: 10.1126/science.abe4882 *

Also Published As

Publication number Publication date
CN114836399A (en) 2022-08-02
CN114836399B (en) 2024-07-05
CN114836400A (en) 2022-08-02

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 22739069; Country of ref document: EP; Kind code of ref document: A1

NENP Non-entry into the national phase. Ref country code: DE

122 Ep: PCT application non-entry in European phase. Ref document number: 22739069; Country of ref document: EP; Kind code of ref document: A1