CN117043341A - Compositions for degrading trypsin or TMPRSS2 - Google Patents



Publication number
CN117043341A
Authority
CN
China
Prior art keywords
protein
thr
trypsin
gly
val
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202280010679.5A
Other languages
Chinese (zh)
Inventor
李优先
渡边荣一郎
川岛祐介
王主君
小原收
本田贤也
新幸二
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Keio School
RIKEN Institute of Physical and Chemical Research
Original Assignee
Keio School
RIKEN Institute of Physical and Chemical Research
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Keio School, RIKEN Institute of Physical and Chemical Research filed Critical Keio School
Priority claimed from PCT/JP2022/001494 external-priority patent/WO2022158434A1/en
Publication of CN117043341A publication Critical patent/CN117043341A/en
Pending legal-status Critical Current


Abstract

A composition for degrading trypsin or TMPRSS2, the active ingredient of which comprises a bacterium having 00502 protein or a bacterium having a protein having 30% or more sequence identity with the amino acid sequence of 00502 protein and having trypsin binding ability, or a bacterium having 00509 protein or a bacterium having a protein having 30% or more sequence identity with the amino acid sequence of 00509 protein and having trypsin binding ability.

Description

Compositions for degrading trypsin or TMPRSS2
Technical Field
The present application relates to compositions for degrading trypsin or TMPRSS2. More particularly, the present application relates to compositions for degrading trypsin or TMPRSS2, diagnostic agents for diseases caused by trypsin or TMPRSS2, and quasi-drugs for diseases caused by trypsin or TMPRSS2. The present application claims priority based on U.S. Provisional Application No. 63/138,798, filed in January 2021, and U.S. Provisional Application No. 63/229,077, filed on August 4, 2021, the contents of which are incorporated herein by reference.
Background
The gastrointestinal tract is a unique organ that is constantly exposed to numerous molecules derived from food and microorganisms. In addition to foreign substances, it encounters host-derived molecules such as digestive enzymes. In the upper intestine, digestive enzymes play an important role in breaking down the large amount of nutrients ingested in the diet into smaller components that can be absorbed. The large intestine, by contrast, mainly absorbs water and does not require digestive enzymes; rather, an imbalance of digestive enzyme activity there is associated with changes in the composition of the microbial flora, disruption of the mucosal barrier, and the development of inflammation.
Intestinal tissue has a variety of regulatory and protective mechanisms, such as the production of mucins and enzyme-inactivating molecules, to maintain homeostasis and barrier function. The intestinal microbiota is also known to contribute significantly to maintaining a stable environment by reducing or modifying substances in the lumen (see, for example, Non-patent document 1).
Prior art literature
Non-patent literature
Non-patent document 1: J. L. Round and S. K. Mazmanian, The gut microbiota shapes intestinal immune responses during health and disease, Nature Reviews Immunology, 9, 313-323, 2009.
Disclosure of Invention
Problems to be solved by the invention
However, the details of how microorganisms regulate proteins in the intestinal lumen are not fully understood. In particular, the properties of the microorganisms involved in regulating digestive enzymes that remain in the large intestine are still largely unknown. The object of the present invention is to elucidate the functions of microorganisms involved in the regulation of residual proteases in the large intestine and to provide a technique for controlling protease activity.
Means for solving the problems
The present invention includes the following aspects.
[1] A composition for degrading trypsin or TMPRSS2, the composition comprising as active ingredients the following bacteria: a bacterium having 00502 protein or a bacterium having a protein which has 30% or more of sequence identity with the amino acid sequence of 00502 protein and has trypsin-binding ability; or alternatively
Bacteria having 00509 protein or bacteria having proteins having 30% or more sequence identity with the amino acid sequence of 00509 protein and having trypsin binding capacity.
[2] A composition for degrading trypsin or TMPRSS2, the composition comprising as active ingredients the following bacteria: a bacterium having 00502 protein or a bacterium having a protein which has 30% or more of sequence identity with the amino acid sequence of 00502 protein and has trypsin-binding ability; and
bacteria having 00509 protein or bacteria having proteins having 30% or more sequence identity with the amino acid sequence of 00509 protein and having trypsin binding capacity.
[3] A composition for degrading trypsin or TMPRSS2, the composition comprising as active ingredients the following bacteria: a bacterium having a gene comprising a base sequence shown in SEQ ID NO. 1 or a bacterium having a gene which has 30% or more sequence identity with a base sequence shown in SEQ ID NO. 1 and encodes a protein having trypsin binding ability; or a bacterium having a gene comprising a base sequence shown in SEQ ID NO. 2 or a bacterium having a gene which has 30% or more sequence identity with a base sequence shown in SEQ ID NO. 2 and encodes a protein having trypsin binding ability.
[4] A composition for degrading trypsin or TMPRSS2, the composition comprising as active ingredients the following bacteria: a bacterium having a gene comprising a base sequence shown in SEQ ID NO. 1 or a bacterium having a gene which has 30% or more sequence identity with a base sequence shown in SEQ ID NO. 1 and encodes a protein having trypsin binding ability; and a bacterium having a gene comprising a base sequence shown in SEQ ID NO. 2 or a bacterium having a gene which has 30% or more sequence identity with a base sequence shown in SEQ ID NO. 2 and encodes a protein having trypsin binding ability.
[5] The composition for degrading trypsin or TMPRSS2 according to any of claims 1-4, wherein the bacteria have a type IX secretion system (T9SS).
[6] The composition for degrading trypsin or TMPRSS2 of claim 5, wherein the T9SS comprises PorV protein, PorU protein, PorN protein, PorM protein, PorL protein, PorK protein, or PorP protein.
[7] The composition for degrading trypsin or TMPRSS2 according to any of claims 1-6, wherein the bacterium is a bacterium of the genus Paraprevotella, Prevotella, Prevotellamassilia or Bacteroides.
[8] The composition for degrading trypsin or TMPRSS2 according to claim 7, wherein the bacterium of the genus Paraprevotella is Paraprevotella clara and the bacterium of the genus Prevotella is at least one bacterium selected from the group consisting of Prevotella rara, Prevotella rodentium and Prevotella muris.
[9] The composition for degrading trypsin or TMPRSS2 according to any of claims 1-8, wherein the bacterium is a bacterium having a 16S rRNA gene comprising the base sequence shown in SEQ ID NO. 3 or SEQ ID NO. 4, or a bacterium having a 16S rRNA gene comprising a base sequence having 97% or more sequence identity with the base sequence shown in SEQ ID NO. 3 or SEQ ID NO. 4.
[10] The composition for degrading trypsin or TMPRSS2 according to any of claims 1-6, wherein the bacterium is at least one bacterium selected from the group consisting of Paraprevotella sp. MSP 0303, Paraprevotella sp. MSP 0335, Prevotellamassilia timonensis, Bacteroides sp. MSP 0288, Bacteroides sp. MSP 0410, Bacteroides sp. MSP 0435 and Porphyromonas gingivalis.
[11] The composition for degrading trypsin or TMPRSS2 of any one of claims 1-10, wherein the bacteria are live bacteria.
[12] The composition for degrading trypsin or TMPRSS2 of any one of claims 1-10, wherein the bacteria are dead bacteria.
[13] A composition for degrading trypsin or TMPRSS2, comprising as active ingredients the following proteins:
00502 protein or a protein having 30% or more sequence identity to the amino acid sequence of said 00502 protein and having trypsin binding capacity; or alternatively
00509 protein or a protein having 30% or more sequence identity to the amino acid sequence of said 00509 protein and having trypsin binding capacity.
[14] A composition for degrading trypsin or TMPRSS2, comprising as active ingredients the following proteins:
00502 protein or a protein having 30% or more sequence identity to the amino acid sequence of said 00502 protein and having trypsin binding capacity; and
00509 protein or a protein having 30% or more sequence identity to the amino acid sequence of said 00509 protein and having trypsin binding capacity.
[15] A composition for degrading trypsin or TMPRSS2 according to any one of claims 1-14 for use in treating a disease caused by trypsin or TMPRSS2.
[16] The composition for degrading trypsin or TMPRSS2 of any one of claims 1-15, wherein the disease caused by trypsin or TMPRSS2 is inflammatory bowel disease (ulcerative colitis, Crohn's disease), irritable bowel disease, infection, acute pancreatitis, or chronic pancreatitis.
[17] The composition for degrading trypsin or TMPRSS2 of claim 16, wherein the infection is a viral infection or a bacterial infection.
[18] The composition for degrading trypsin or TMPRSS2 of claim 16 or 17, wherein the inflammatory bowel disease, irritable bowel disease, or infection is a disease involving TMPRSS2 or IgA.
[19] A diagnostic agent for a disease caused by trypsin or TMPRSS2, the diagnostic agent comprising a specific binding substance for detecting: 00502 protein or a protein having 30% or more sequence identity to the amino acid sequence of 00502 protein and having trypsin binding capacity; or alternatively
00509 protein or a protein having 30% or more sequence identity to the amino acid sequence of 00509 protein and having trypsin binding capacity.
[20] A diagnostic agent for a disease caused by trypsin or TMPRSS2, the diagnostic agent comprising a primer set or probe for detecting:
A gene comprising a base sequence shown as SEQ ID NO. 1 or a gene having 30% or more sequence identity with a base sequence shown as SEQ ID NO. 1 and encoding a protein having trypsin binding ability; or alternatively
A gene comprising a base sequence shown as SEQ ID NO. 2 or a gene having 30% or more sequence identity with a base sequence shown as SEQ ID NO. 2 and encoding a protein having trypsin binding ability.
[21] A quasi-drug for diseases caused by trypsin or TMPRSS2, which contains bacteria having the following proteins as active ingredients: 00502 protein or a protein having 30% or more sequence identity to the amino acid sequence of 00502 protein and having trypsin binding capacity; or alternatively
00509 protein or a protein having 30% or more sequence identity to the amino acid sequence of 00509 protein and having trypsin binding capacity.
[22] A quasi-drug for diseases caused by trypsin or TMPRSS2, which contains as active ingredients the following proteins:
00502 protein or a protein having 30% or more sequence identity to the amino acid sequence of said 00502 protein and having trypsin binding capacity; or alternatively
00509 protein or a protein having 30% or more sequence identity to the amino acid sequence of said 00509 protein and having trypsin binding capacity.
Advantageous Effects of Invention
The present invention can provide a technique for controlling protease activity.
Drawings
FIG. 1 is a graph showing the results of proteomic analysis in example 1;
FIG. 2 is a graph showing the results of measuring trypsin activity in feces of specific pathogen-free (SPF) mice and germ-free (GF) mice in example 1;
FIG. 3 is a photograph showing the results of Western blot analysis of anionic trypsin (PRSS2) in feces of SPF mice and GF mice in example 1;
FIG. 4 is a fluorescence micrograph showing mucus (UEA1) and anionic trypsin (PRSS2) detected by immunostaining of large intestine sections of SPF mice and GF mice in example 1;
FIG. 5 is a graph showing the results of trypsin activity measured at different locations of the small intestine and the large intestine of GF mice and SPF mice in example 1;
FIG. 6 is a graph showing the results of measuring trypsin activity in feces of GF mice to which a fecal sample collected from a healthy human donor was administered in example 2;
FIG. 7 is a graph showing the results of measurement of trypsin activity in the feces of each group of mice in example 2;
FIG. 8 is a photograph showing the result of measuring trypsin degradation by Western blot in example 2;
FIG. 9 is a photograph showing the result of measuring degradation of human trypsin by Western blot in example 2;
FIG. 10 is a photograph showing the result of measuring the trypsin degradation ability of various bacteria by Western blot in example 2;
FIG. 11 is a photograph showing the results of incubating recombinant mouse PRSS2 (rmPRSS2) pretreated with a protease inhibitor together with P. clara (1C4) and analyzing rmPRSS2 degradation by Western blot in example 3;
FIG. 12 is a photograph showing the result of observing the binding of rmPRSS2 to the bacterial surface by confocal microscopy in example 3;
FIG. 13 is a photograph showing the result of incubating P. clara (1C4) pretreated with tunicamycin with rmPRSS2 and analyzing the degradation of rmPRSS2 by Western blot in example 3;
FIG. 14 is a photograph showing the results of incubating P. clara (1C4), treated or not treated with tunicamycin, with rmPRSS2 and observing the binding of rmPRSS2 to the bacterial surface by confocal microscopy in example 3;
FIG. 15 is a schematic diagram showing the arrangement of the genes constituting the type IX secretion system (T9SS) in the genomes of P. clara (JCM 14859), P. xylaniphila (JCM 14860) and P. gingivalis (ATCC 33277) in example 3;
FIG. 16 is a schematic diagram showing the homologous recombination used to abolish PorU expression in example 3;
FIG. 17 is a photograph showing the result of incubating mutant P. clara (JCM 14859) with rmPRSS2 and analyzing the degradation of rmPRSS2 by Western blot in example 3;
FIG. 18 is a photograph showing the results of Western blot analysis of rmPRSS2 degradation mediated by wild-type P. clara (JCM 14859) or a series of mutant P. clara (JCM 14859) in example 3;
FIG. 19 is a photograph showing the results of incubating P. clara (Δ00502) and P. clara (Δ00509) with rmPRSS2, respectively, and observing the binding of rmPRSS2 to the bacterial surface by confocal microscopy in example 3;
FIG. 20 is a photograph showing the results of incubating P. clara (Δ00502) and P. clara (Δ00509) with rmPRSS2, respectively, and analyzing the degradation of rmPRSS2 by Western blot in example 3;
FIG. 21 is a schematic diagram showing the genomic sequences of the Paraprevotella strains analyzed in example 3;
FIG. 22 is a photograph showing the result of incubating P. clara mutants, each lacking one of the 00502 to 00509 genes, with rmPRSS2 and analyzing the degradation of rmPRSS2 by Western blot in example 3;
FIG. 23 is a graph showing the results of measuring the protease activity of recombinant 00502 protein or 00509 protein by cleavage of FITC-labeled casein in example 4;
FIG. 24 is a photograph showing the result of incubating recombinant 00502 protein and 00509 protein, either in free form or bound to microbeads, with rmPRSS2 and analyzing the degradation of rmPRSS2 by Western blot in example 4;
FIG. 25 is a photograph showing the results of incubating microbeads bound to recombinant 00502 protein, recombinant 00509 protein or bovine serum albumin (BSA) with rmPRSS2, respectively, and observing the binding of rmPRSS2 to the protein-bound microbeads by confocal microscopy in example 4;
FIG. 26 is a photograph showing the result of incubating the cecal contents of GF mice with a medium control (-) or with recombinant 00502 protein bound to microbeads and analyzing trypsin degradation by Western blot in example 4;
FIG. 27 is a diagram showing the model of trypsin degradation mediated by Paraprevotella bacteria proposed by the inventors;
FIG. 28 is a graph showing the results of DNA quantification of the P. clara strain in feces of GF mice inoculated with the P. clara strain (wild type (WT), Δ00502 or Δ00509) and 2-mix in example 5;
FIG. 29 is a graph showing the results of measurement of trypsin activity in feces of GF mice inoculated with the P. clara strain (wild type (WT), Δ00502 or Δ00509) and 2-mix in example 5;
FIG. 30 is a graph showing the results of measurement of trypsin activity in feces of GF mice inoculated with the P. clara strain (wild type (WT) or Δ00502) and 34-mix in example 5;
FIG. 31 is a photograph showing the results of Western blot analysis of proteins in feces of GF mice inoculated with the P. clara strain (wild type (WT), Δ00502 or Δ00509) and 2-mix in example 5;
FIG. 32 is a photograph showing the results of Western blot analysis of proteins in the feces of GF mice, the feces of GF mice inoculated with the wild-type P. clara strain and 2-mix, a mixed sample of the two, and a mixed sample of the two plus a trypsin inhibitor (TLCK) in example 5;
FIG. 33 is a graph showing the change in body weight of each group of mice after C. rodentium infection in example 6;
The upper half of FIG. 34 shows images of the large intestine of each group of mice 7 days after C. rodentium infection in example 6; the lower half of FIG. 34 shows representative micrographs of hematoxylin-eosin-stained cecal tissue in example 6;
FIG. 35 is a graph showing the cecal histological scores based on hematoxylin-eosin staining in example 6;
FIG. 36 is a photograph showing the results of Western blot analysis of proteins in feces in example 6;
FIG. 37 is a photograph showing the result of evaluating the aggregation effect of fecal IgA by in vitro incubation of live C. rodentium with fecal fluid in example 6;
FIG. 38 is a schematic view showing the experimental schedule of example 7;
FIG. 39 is a graph showing the change in body weight of mice after C. rodentium infection in example 7;
FIG. 40 is a graph showing CFU measurements of C. rodentium in cecal patches in example 7;
FIG. 41 is a graph showing CFU measurements of C. rodentium in the luminal contents in example 7;
FIG. 42 is a photograph showing the results of Western blot analysis of the proteins in the luminal contents of the cecum in example 7;
FIG. 43 is a schematic diagram showing the mouse hepatitis virus (MHV) infection experiment in example 8;
FIG. 44 is a graph showing the survival curve of mice after MHV infection in example 8;
FIG. 45 is a graph showing the results of measuring MHV virus titers in the liver, brain and stool in example 8;
FIG. 46 is a representative image showing hematoxylin-eosin staining results of liver tissue of each group of mice in example 8;
FIG. 47 is a diagram showing homologs of the genes related to trypsin degradation (00502 gene and 00509 gene), and the species encoding them, identified by in silico search in example 9;
FIG. 48 is a graph showing the locus structure of the genes involved in trypsin degradation in each bacterial species and their sequence identity (%) with the genes of the P. clara strain in example 9;
FIG. 49 is a graph showing the results of measuring trypsin activity in feces of the non-IBD control group (healthy subjects), the ulcerative colitis (UC) group and the Crohn's disease (CD) group in the Japanese cohort in example 9;
FIG. 50 is a graph showing Paraprevotella carriage rates in patient populations diagnosed with ulcerative colitis (UC) and Crohn's disease (CD) in the PRISM and HMP2 cohorts in example 9;
FIG. 51 is a photograph showing the results of measuring trypsin degradation mediated by the Prevotella rodentium and Prevotella muris strains in example 9;
FIG. 52 is a photograph showing the results of measuring trypsin degradation activity mediated by the Prevotella rara (MSP 0081) strain in example 9;
FIG. 53 is a schematic view showing an example of a fusion protein of 00502 protein or its homolog and/or 00509 protein or its homolog with an antibody constant region.
Detailed Description of Embodiments
[ composition for degrading trypsin or TMPRSS2 ]
In one embodiment, the present invention provides a composition for degrading trypsin or TMPRSS2, the composition comprising as active ingredients the following bacteria: a bacterium having 00502 protein or a bacterium having a protein having 30% or more of sequence identity with the amino acid sequence of 00502 protein and having trypsin binding ability, or a bacterium having 00509 protein or a bacterium having a protein having 30% or more of sequence identity with the amino acid sequence of 00509 protein and having trypsin binding ability.
As described in the examples below, the inventors have shown that bacteria having 00502 protein or 00509 protein adsorb trypsin or TMPRSS2 onto the bacterial surface, where the trypsin or TMPRSS2 is degraded through its own autolysis.
Therefore, a composition containing, as an active ingredient, a bacterium having 00502 protein or a bacterium having 00509 protein can be used to degrade trypsin or TMPRSS2. The composition of this embodiment can thus also be described as a degrading agent for trypsin or TMPRSS2.
Degradation of trypsin or TMPRSS2 may be useful in industrial or pharmaceutical applications, as described below.
The UniProtKB accession number of the 00502 protein of the Paraprevotella clara (YIT 11840) strain (catalog number: JCM 14859) described below is G5SNC9. The 00502 protein of the Paraprevotella clara (YIT 11840) strain is encoded by the HMPREF9441_00858 gene (UniProtKB), whose NCBI accession number is NZ_JH376591 REGION: complement(87340..91131). The amino acid sequence of the 00502 protein of the Paraprevotella clara (YIT 11840) strain is shown in SEQ ID NO. 5, and the cDNA base sequence of the HMPREF9441_00858 gene is shown in SEQ ID NO. 1.
The amino acid sequence of the 00502 protein of the Paraprevotella clara (1C4) strain is shown in SEQ ID NO. 6, and the cDNA base sequence of the gene encoding the 00502 protein of the Paraprevotella clara (1C4) strain is shown in SEQ ID NO. 7.
The amino acid sequence of the 00502 protein of the Paraprevotella xylaniphila (82A6) strain is shown in SEQ ID NO. 8, and the cDNA base sequence of the gene encoding the 00502 protein of the Paraprevotella xylaniphila (82A6) strain is shown in SEQ ID NO. 9.
The amino acid sequence of the 00502 protein of the Paraprevotella xylaniphila (YIT 11841) strain (catalog number: JCM 14860) is shown in SEQ ID NO. 10, and the cDNA base sequence of the gene encoding the 00502 protein of the Paraprevotella xylaniphila (YIT 11841) strain is shown in SEQ ID NO. 11.
The amino acid sequence of the 00502 protein of the Prevotella rara (109) strain (catalog number: DSM 105141) is shown in SEQ ID NO. 12, and the cDNA base sequence of the gene encoding the 00502 protein of the Prevotella rara (109) strain is shown in SEQ ID NO. 13.
The amino acid sequence of the 00502 protein of the Prevotella rodentium (PJ1A) strain (catalog number: DSM 105243) is shown in SEQ ID NO. 14, and the cDNA base sequence of the gene encoding the 00502 protein of the Prevotella rodentium (PJ1A) strain is shown in SEQ ID NO. 15.
The amino acid sequence of the 00502 protein of the Prevotella muris (PMUR) strain (catalog number: DSM 103722) is shown in SEQ ID NO. 16, and the cDNA base sequence of the gene encoding the 00502 protein of the Prevotella muris (PMUR) strain is shown in SEQ ID NO. 17.
The UniProtKB accession number of the 00509 protein of the Paraprevotella clara (YIT 11840) strain (catalog number: JCM 14859) is G5SNC1. The 00509 protein of the Paraprevotella clara (YIT 11840) strain is encoded by the HMPREF9441_00850 gene (UniProtKB), whose NCBI accession number is NZ_JH376591 REGION: 73848..76931. The amino acid sequence of the 00509 protein of the Paraprevotella clara (YIT 11840) strain is shown in SEQ ID NO. 18, and the cDNA base sequence of the HMPREF9441_00850 gene is shown in SEQ ID NO. 2.
The amino acid sequence of the 00509 protein of the Paraprevotella clara (1C4) strain is shown in SEQ ID NO. 19, and the cDNA base sequence of the gene encoding the 00509 protein of the Paraprevotella clara (1C4) strain is shown in SEQ ID NO. 20.
The amino acid sequence of the 00509 protein of the Paraprevotella xylaniphila (82A6) strain is shown in SEQ ID NO. 21, and the cDNA base sequence of the gene encoding the 00509 protein of the Paraprevotella xylaniphila (82A6) strain is shown in SEQ ID NO. 22.
The amino acid sequence of the 00509 protein of the Paraprevotella xylaniphila (YIT 11841) strain (catalog number: JCM 14860) is shown in SEQ ID NO. 23, and the cDNA base sequence of the gene encoding the 00509 protein of the Paraprevotella xylaniphila (YIT 11841) strain is shown in SEQ ID NO. 24.
The amino acid sequence of the 00509 protein of the Prevotella rara (109) strain (catalog number: DSM 105141) is shown in SEQ ID NO. 25, and the cDNA base sequence of the gene encoding the 00509 protein of the Prevotella rara (109) strain is shown in SEQ ID NO. 26.
The amino acid sequence of the 00509 protein of the Prevotella rodentium (PJ1A) strain (catalog number: DSM 105243) is shown in SEQ ID NO. 27, and the cDNA base sequence of the gene encoding the 00509 protein of the Prevotella rodentium (PJ1A) strain is shown in SEQ ID NO. 28.
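As a quick consistency check on the coordinate ranges quoted above for the HMPREF9441_00858 and HMPREF9441_00850 genes, the following minimal Python sketch computes the length of each range and tests whether it is a whole number of codons. The helper name `region_length` is hypothetical, and the sketch assumes that the coordinates are 1-based and inclusive and that the strand qualifier on the first range is a `complement(...)` wrapper, which does not affect length; neither assumption is stated in the source.

```python
import re


def region_length(region: str) -> int:
    """Return the number of bases spanned by a 'start..end' coordinate range.

    An optional wrapper such as 'complement(...)' is tolerated because only
    the two integers are read; strand does not change the length.
    """
    match = re.search(r"(\d+)\.\.(\d+)", region)
    if match is None:
        raise ValueError(f"no 'start..end' range found in {region!r}")
    start, end = int(match.group(1)), int(match.group(2))
    if end < start:
        raise ValueError("end coordinate precedes start coordinate")
    return end - start + 1  # assumed 1-based, inclusive coordinates


# Ranges as quoted in the text (strand wrapper is an assumption).
regions = {
    "HMPREF9441_00858": "complement(87340..91131)",
    "HMPREF9441_00850": "73848..76931",
}

for gene, region in regions.items():
    n_bases = region_length(region)
    print(f"{gene}: {n_bases} bases, {n_bases // 3} codons, remainder {n_bases % 3}")
```

Under these assumptions both ranges are multiples of three (3792 and 3084 bases), which is what one would expect of an uninterrupted bacterial coding region; whether the final codon is a stop codon is not something the text specifies.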
As discussed in the examples below, the composition for degrading trypsin or TMPRSS2 according to the present embodiment may include bacteria having 00502 protein homolog or bacteria having 00509 protein homolog as an active ingredient.
00502 protein homologs include proteins that have at least 30%, 40%, 50%, 60%, 70%, 80%, 90% or 95% sequence identity with the amino acid sequence shown in SEQ ID NO. 5 and that have trypsin binding ability. Bacteria having a 00502 protein homolog should have trypsin or TMPRSS2 binding ability and, further, trypsin or TMPRSS2 degrading activity.
00509 protein homologs include proteins that have at least 30%, 40%, 50%, 60%, 70%, 80%, 90% or 95% sequence identity with the amino acid sequence shown in SEQ ID NO. 18 and that have trypsin binding ability. Bacteria having a 00509 protein homolog should have trypsin or TMPRSS2 binding ability and, further, trypsin or TMPRSS2 degrading activity.
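The homolog definition above combines a sequence identity threshold with a functional requirement (trypsin binding). The Python sketch below illustrates one way such a check could be expressed. It is only an illustration: the text does not state how identity is calculated, so the sketch assumes two sequences that have already been aligned (gaps written as "-") and counts identity over columns where neither sequence has a gap. All function names and the toy sequences are hypothetical and are not taken from the sequence listing.

```python
THRESHOLDS = (30, 40, 50, 60, 70, 80, 90, 95)  # percent tiers listed in the text


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    The choice of denominator is an assumption; other conventions exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no aligned residue pairs")
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


def highest_tier(identity: float):
    """Return the highest listed threshold that the identity meets, or None."""
    met = [t for t in THRESHOLDS if identity >= t]
    return max(met) if met else None


def is_candidate_homolog(identity: float, binds_trypsin: bool) -> bool:
    """Both conditions from the text: at least 30% identity and trypsin binding."""
    return identity >= THRESHOLDS[0] and binds_trypsin


# Toy pre-aligned sequences for illustration only.
reference = "MKTLV-AGG"
candidate = "MKSLVQAG-"
identity = percent_identity(reference, candidate)
print(round(identity, 1), highest_tier(identity), is_candidate_homolog(identity, True))
```

Trypsin binding is passed in as a boolean because it is an experimental property that cannot be derived from the sequence comparison itself.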
The "containing as an active ingredient" of the composition for degrading trypsin or TMPRSS2 according to the present embodiment means containing a sufficient amount of a bacterium having 00502 protein or 00502 protein homolog, or containing a sufficient amount of a bacterium having 00509 protein or 00509 protein homolog, to degrade trypsin or TMPRSS2. Alternatively, "containing as an active ingredient" means containing a bacterium having 00502 protein or a 00502 protein homolog, or containing a bacterium having 00509 protein or a 00509 protein homolog as a main active ingredient.
In the composition for degrading trypsin or TMPRSS2 according to the present embodiment, a bacterium having 00502 protein or a 00502 protein homolog refers to a bacterium expressing 00502 protein or a 00502 protein homolog.
The bacterium expressing 00502 protein may be a bacterium having the HMPREF9441_00858 gene (SEQ ID NO. 1) encoding 00502 protein. The bacterium having a 00502 protein homolog may be a bacterium having a gene that has at least 30%, 40%, 50%, 60%, 70%, 80%, 90% or 95% sequence identity with the base sequence shown in SEQ ID NO. 1 and that encodes a protein having trypsin binding ability. The 00502 protein may be derived from an endogenous gene or an exogenous gene of the bacterium.
Likewise, a bacterium having 00509 protein or a 00509 protein homolog refers to a bacterium expressing 00509 protein or a 00509 protein homolog.
The bacterium expressing 00509 protein may be a bacterium having the HMPREF9441_00850 gene (SEQ ID NO. 2) encoding 00509 protein. The bacterium having a 00509 protein homolog may be a bacterium having a gene that has at least 30%, 40%, 50%, 60%, 70%, 80%, 90% or 95% sequence identity with the base sequence shown in SEQ ID NO. 2 and that encodes a protein having trypsin binding ability. The 00509 protein may be derived from an endogenous gene or an exogenous gene of the bacterium.
The composition for degrading trypsin or TMPRSS2 according to the present embodiment may contain only bacteria having 00502 protein or a 00502 protein homolog, only bacteria having 00509 protein or a 00509 protein homolog, or both kinds of bacteria, as long as the composition has the activity of degrading trypsin or TMPRSS2. Alternatively, the composition may comprise a bacterium that has both 00502 protein or a 00502 protein homolog and 00509 protein or a 00509 protein homolog. Such bacteria may be prepared by genetic engineering or other means.
In the composition for degrading trypsin or TMPRSS2 according to the present embodiment, the bacteria having 00502 protein or a 00502 protein homolog, or the bacteria having 00509 protein or a 00509 protein homolog, should have a type IX secretion system (T9SS).
For example, the T9SS contains PorV protein, PorU protein, PorN protein, PorM protein, PorL protein, PorK protein, or PorP protein.
As discussed in the examples below, the inventors have demonstrated that 00502 protein or a 00502 protein homolog, or 00509 protein or a 00509 protein homolog, is transported across the outer membrane by the T9SS and bound to the bacterial surface.
For example, in the Paraprevotella clara strain described below, the NCBI accession number of PorV protein is WP_008622445.1, that of PorU protein is WP_008622443.1, that of PorN protein is WP_008623210.1, that of PorM protein is WP_008623211.1, that of PorL protein is WP_008623213.1, that of PorK protein is WP_008623215.1, and that of PorP protein is WP_008623217.1. In bacteria other than the Paraprevotella clara strain, the homologs of these proteins in the respective bacterium constitute the T9SS.
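The component list above can be held as a simple lookup table. The Python sketch below records the accession numbers quoted in the text and reports which of the listed T9SS components are absent from a given set of annotated protein accessions. The function name `missing_t9ss_components` is hypothetical, and exact accession matching is an assumption that is only meaningful for Paraprevotella clara itself; for other bacteria, where homologs constitute the T9SS, a homology search would be needed instead.

```python
# NCBI protein accession numbers quoted in the text for Paraprevotella clara.
T9SS_COMPONENTS = {
    "PorV": "WP_008622445.1",
    "PorU": "WP_008622443.1",
    "PorN": "WP_008623210.1",
    "PorM": "WP_008623211.1",
    "PorL": "WP_008623213.1",
    "PorK": "WP_008623215.1",
    "PorP": "WP_008623217.1",
}


def missing_t9ss_components(annotated_accessions):
    """Return the names of listed T9SS components not found in the annotation.

    Matching is by exact accession string, which is an illustrative
    simplification rather than a homology-based test.
    """
    found = set(annotated_accessions)
    return sorted(name for name, acc in T9SS_COMPONENTS.items() if acc not in found)


# Hypothetical annotation containing only two of the listed components.
example_annotation = ["WP_008622445.1", "WP_008622443.1"]
print(missing_t9ss_components(example_annotation))
```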
The bacterium having 00502 protein or a 00502 protein homolog, or the bacterium having 00509 protein or a 00509 protein homolog, in the composition for degrading trypsin or TMPRSS2 according to the present embodiment may be a bacterium belonging to the genus Paraprevotella, Prevotella or Bacteroides.
As discussed in the examples below, the inventors have demonstrated that bacteria belonging to the genera Paraprevotella, Prevotella, Prevotellamassilia or Bacteroides can degrade trypsin or TMPRSS2. Bacteria of the genus Paraprevotella include Paraprevotella clara, Paraprevotella xylaniphila, Paraprevotella sp. MSP 0303, Paraprevotella sp. MSP 0335, and the like. Bacteria of the genus Prevotella include Prevotella rara, Prevotella rodentium, Prevotella muris, and the like.
The bacterium of the genus Paraprevotella has a 16S rRNA gene composed of the base sequence shown in SEQ ID NO. 3 or SEQ ID NO. 4, or a 16S rRNA gene composed of a base sequence having 97% or more sequence identity with the base sequence shown in SEQ ID NO. 3 or SEQ ID NO. 4. It should be noted that bacteria are considered to belong to the same species if the sequence identity of their 16S rRNA genes is 97% or more.
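The 97% criterion stated above amounts to a threshold test against the two reference 16S rRNA gene sequences. The following Python sketch applies that test to identity values that are assumed to have been computed elsewhere; the function name `matching_references` and the numeric values are hypothetical and are shown only to illustrate the rule.

```python
SAME_SPECIES_CUTOFF = 97.0  # percent identity of the 16S rRNA gene, from the text


def matching_references(identity_by_reference, cutoff=SAME_SPECIES_CUTOFF):
    """Return the reference sequences whose 16S identity meets the cutoff."""
    return sorted(
        ref for ref, identity in identity_by_reference.items() if identity >= cutoff
    )


# Hypothetical identities (percent) of one isolate against the two references.
isolate = {"SEQ ID NO. 3": 98.6, "SEQ ID NO. 4": 94.1}
hits = matching_references(isolate)
print(hits if hits else "criterion not met")
```

An isolate satisfies the criterion if the returned list is non-empty, that is, if it meets the cutoff against either reference sequence.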
The base sequence shown in SEQ ID NO. 3 is the base sequence of the 16S rRNA gene of the Paraprevotella clara (YIT 11840) strain (catalog number: JCM 14859) described below. The base sequence shown in SEQ ID NO. 29 is the base sequence of the 16S rRNA gene of the Paraprevotella clara (1C 4) strain. The base sequence shown in SEQ ID NO. 4 is the base sequence of the 16S rRNA gene of the Paraprevotella xylaniphila (82A 6) strain described below. The base sequence shown in SEQ ID NO. 30 is the base sequence of the 16S rRNA gene of strain Paraprevotella xylaniphila (YIT 11841) (catalog number: JCM 14860).
The bacterium having 00502 protein or 00502 protein homolog, or the bacterium having 00509 protein or 00509 protein homolog in the composition for degrading trypsin or TMPRSS2 according to the present embodiment may be at least one bacterium selected from Paraprevotella sp. MSP 0303, Paraprevotella sp. MSP 0335, Prevotellamassilia timonensis, Bacteroides sp. MSP 0288, Bacteroides sp. MSP 0410, Bacteroides sp. MSP 0435, and Porphyromonas gingivalis.
Prevotella rara includes the Prevotella rara (MSP 0081) strain and the Prevotella rara (109) strain. Prevotella rodentium includes the Prevotella rodentium (PJ 1A) strain (catalog number: DSM 105243). Prevotella muris includes the Prevotella muris (PMUR) strain (catalog number: DSM 103722). Prevotellamassilia timonensis includes the Prevotellamassilia timonensis (MSP 0224) strain. Porphyromonas gingivalis includes the Porphyromonas gingivalis (ATCC 33277) strain.
As discussed in the examples below, the inventors have demonstrated that these bacteria can degrade trypsin or TMPRSS2.
The base sequence shown in SEQ ID NO. 31 is the base sequence of the 16S rRNA gene of the Prevotella rara (109) strain (catalog number: DSM 105141). The base sequence shown in SEQ ID NO. 32 is the base sequence of the 16S rRNA gene of the Prevotella rodentium (PJ 1A) strain (catalog number: DSM 105243). The base sequence shown in SEQ ID NO. 33 is the base sequence of the 16S rRNA gene of the Prevotella muris (PMUR) strain (catalog number: DSM 103722).
In the composition for degrading trypsin or TMPRSS2 according to the present embodiment, the above bacteria may be living bacteria or dead bacteria as long as they have the degrading activity of trypsin or TMPRSS2.
The composition for degrading trypsin or TMPRSS2 according to the present embodiment may be in an aqueous or semi-solid form, such as a solution or a suspension, or may contain the above bacteria in a powder form, a lyophilized form, or the like. In one embodiment, the composition or bacteria are lyophilized. In one embodiment, a subset of bacteria in the composition is lyophilized. Methods of lyophilizing a composition comprising bacteria are well known in the art. For example, reference may be made to the specification of U.S. Pat. No. 3261761, the specification of U.S. Pat. No. 4205132 and International Publication No. 2012/098358. These references are incorporated herein by reference.
The bacteria may be lyophilized as a composition or may be separately lyophilized and then combined. The bacteria may be combined with a pharmaceutically acceptable carrier before being combined with other bacteria. A variety of lyophilized bacteria can be combined in lyophilized form. The combined bacterial mixture may then be combined with a pharmaceutically acceptable carrier. In one embodiment, the bacteria are lyophilized solids. In one embodiment, the composition is a lyophilized solid.
In one embodiment, the present invention provides a composition for degrading trypsin or TMPRSS2, the composition comprising as active ingredients the following proteins: 00502 protein or a protein having 30% or more sequence identity with the amino acid sequence of said 00502 protein and having trypsin binding capacity, or 00509 protein or a protein having 30% or more sequence identity with the amino acid sequence of said 00509 protein and having trypsin binding capacity.
As discussed in the examples below, the inventors have demonstrated that 00502 protein or 00509 protein can degrade trypsin or TMPRSS2. Here, the 00502 protein or 00509 protein is preferably immobilized on a solid-phase surface.
The solid phase may be any material including particles made of resin, glass or metal, the surface of a container (e.g., a plate or tube, etc.), a film made of resin, etc. The particles may be magnetic particles.
The solid phase may be a pharmaceutically acceptable solid phase. Pharmaceutically acceptable solid phases include liposomes, polymer nanoparticles (e.g., protein nanoparticles, lipid nanoparticles, etc.), nanoemulsions (e.g., iron nanoparticles, lipid microspheres, etc.), micelles, vaccine adjuvants, nanocrystals, etc. The pharmaceutically acceptable solid phase is preferably one approved by the Pharmaceuticals and Medical Devices Agency (PMDA), the U.S. Food and Drug Administration (FDA), or the European Medicines Agency (EMA).
The method of immobilizing 00502 or 00509 protein on a solid phase is not limited, and includes methods of binding using a chemical linker, binding using avidin-biotin, and physical adsorption.
As discussed in the examples below, the compositions for degrading trypsin or TMPRSS2 may comprise 00502 protein homolog or 00509 protein homolog as an active ingredient.
As described above, the 00502 protein homolog may have at least 30%, 40%, 50%, 60%, 70%, 80%, 90% or 95% sequence identity with the amino acid sequence shown in SEQ ID NO. 5, and has trypsin binding ability. Preferably, the 00502 protein homolog has trypsin or TMPRSS2 binding ability.
As described above, the 00509 protein homolog may have at least 30%, 40%, 50%, 60%, 70%, 80%, 90% or 95% sequence identity with the amino acid sequence shown in SEQ ID NO. 18, and has trypsin binding ability. Preferably, the 00509 protein homolog has trypsin or TMPRSS2 binding ability.
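The sequence identity thresholds above can be illustrated with a minimal sketch. It assumes that the two amino acid sequences have already been aligned by some alignment tool (gaps written as "-") and that identity is counted over columns in which neither sequence has a gap; the specification does not state which identity definition or alignment program is intended, so both choices are assumptions, and the function names and toy sequences are hypothetical.

```python
# Sketch only: percent identity between two PRE-ALIGNED sequences.
# Assumption: identity = identical columns / columns without a gap in either sequence.

IDENTITY_THRESHOLDS = (30, 40, 50, 60, 70, 80, 90, 95)  # percentages listed in the text


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Return percent identity over gap-free aligned columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * matches / compared


def highest_threshold_met(identity: float):
    """Return the largest listed threshold that the identity reaches, or None."""
    met = [t for t in IDENTITY_THRESHOLDS if identity >= t]
    return max(met) if met else None


if __name__ == "__main__":
    # Toy sequences for illustration, not SEQ ID NO. 5 or SEQ ID NO. 18.
    identity = percent_identity("MKT-LLVA", "MKTALIVA")
    print(round(identity, 1), highest_threshold_met(identity))
```

Note that sequence identity alone does not establish homolog status in the sense of the text; the protein must also have trypsin binding ability, which has to be determined experimentally.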
In the composition for degrading trypsin or TMPRSS2 according to the present embodiment, "containing as an active ingredient" means containing a sufficient amount of 00502 protein or 00502 protein homolog, or containing a sufficient amount of 00509 protein or 00509 protein homolog, to degrade trypsin or TMPRSS2. Alternatively, "containing as an active ingredient" means containing 00502 protein or 00502 protein homolog, or containing 00509 protein or 00509 protein homolog as a main active ingredient.
A peptide tag for protein purification, protein detection or binding to a solid phase may be added to 00502 protein or 00502 protein homolog, or to 00509 protein or 00509 protein homolog. Peptide tags include, but are not limited to, histidine tags, FLAG tags, MYC tags, and the like.
The composition for degrading trypsin or TMPRSS2 according to the present embodiment may contain only 00502 protein or 00502 protein homolog, only 00509 protein or 00509 protein homolog, or both 00502 protein or 00502 protein homolog and 00509 protein or 00509 protein homolog, as long as it has an activity of degrading trypsin or TMPRSS2.
When the composition for degrading trypsin or TMPRSS2 according to the present embodiment contains both 00502 protein or its homolog and 00509 protein or its homolog, the 00502 protein or its homolog and the 00509 protein or its homolog may be linked to each other.
Here, 00502 protein or its homolog and 00509 protein or its homolog may be linked linearly or circularly. The 00502 protein or its homolog and 00509 protein or its homolog may be bound directly or via a linker. The linker is not particularly limited; examples include a peptide comprising an amino acid sequence in which GGGS (SEQ ID NO: 34) is repeated 1 to 4 times.
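As a small illustration of the linker described above, the following sketch builds a linker from 1 to 4 repeats of GGGS and joins two protein sequences either directly or through that linker. The function names and the placeholder sequences are hypothetical and serve only to show the linear arrangement; they are not the sequences of 00502 protein or 00509 protein.

```python
# Sketch only: linear fusion of two sequences, directly or via a (GGGS)n linker.

LINKER_UNIT = "GGGS"  # repeat unit given in the text (SEQ ID NO: 34)


def build_linker(repeats: int) -> str:
    """Return GGGS repeated 1 to 4 times, the range described in the text."""
    if not 1 <= repeats <= 4:
        raise ValueError("the text describes 1 to 4 repeats")
    return LINKER_UNIT * repeats


def fuse(first: str, second: str, repeats: int = 0) -> str:
    """Join two sequences directly (repeats=0) or through a (GGGS)n linker."""
    linker = build_linker(repeats) if repeats else ""
    return first + linker + second


if __name__ == "__main__":
    # "MK" and "TL" are placeholder fragments, not real protein sequences.
    print(fuse("MK", "TL"))      # direct fusion
    print(fuse("MK", "TL", 2))   # fusion via two GGGS repeats
```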
00502 protein or a homolog thereof and/or 00509 protein or a homolog thereof may be present in the form of a protein fused to the constant region of an antibody. The antibody constant region may be a constant region derived from a human antibody or a constant region derived from a human IgG type antibody.
FIG. 53 is a schematic view showing an example of a fusion protein of 00502 protein or its homolog and/or 00509 protein or its homolog with an antibody constant region. In FIG. 53, "protein" represents 00502 protein or its homolog, or 00509 protein or its homolog, "CH2" represents the CH2 domain of the antibody constant region, and "CH3" represents the CH3 domain of the antibody constant region.
As shown in FIG. 53, 00502 protein or its homolog and/or 00509 protein or its homolog and the antibody constant region may be linked by a linker. The linker is not particularly limited; examples include a peptide comprising an amino acid sequence in which GGGS (SEQ ID NO: 34) is repeated 1 to 4 times.
[Pharmaceutical composition for treating diseases caused by trypsin or TMPRSS2]
In one embodiment, the above-described compositions for degrading trypsin or TMPRSS2 are useful for treating diseases caused by trypsin or TMPRSS2. In other words, in one embodiment, the present invention provides a pharmaceutical composition for treating a disease caused by trypsin or TMPRSS2. The pharmaceutical composition of the present embodiment contains the following as an active ingredient: a bacterium having 00502 protein or a homolog of 00502 protein, a bacterium having 00509 protein or a homolog of 00509 protein, 00502 protein or a homolog of 00502 protein, or 00509 protein or a homolog of 00509 protein.
In the pharmaceutical composition of this embodiment, 00502 protein, 00502 protein homolog, 00509 protein, 00509 protein homolog, and bacteria having these proteins are the same as those described above.
In the pharmaceutical composition according to the present embodiment, diseases caused by trypsin or TMPRSS2 include inflammatory bowel disease (ulcerative colitis, Crohn's disease), irritable bowel syndrome, acute pancreatitis and chronic pancreatitis.
In addition, in the pharmaceutical composition of the present embodiment, diseases caused by trypsin or TMPRSS2 include infections. Infections include viral and bacterial infections. Such inflammatory bowel disease, irritable bowel syndrome or infection includes diseases involving TMPRSS2 or IgA. Infections involving TMPRSS2 include coronavirus infections and the like, and infections involving IgA include Salmonella infections and the like.
The pharmaceutical composition according to this embodiment may be formulated with a pharmaceutically acceptable carrier. As pharmaceutically acceptable carriers, those commonly used in formulating pharmaceutical compositions may be used without limitation. More specifically, examples include binders (such as gelatin, corn starch, tragacanth, acacia, and the like), excipients (such as starch, crystalline cellulose and the like), and bulking agents (such as alginic acid and the like).
The pharmaceutical composition according to the present embodiment may contain additives. Additives include lubricants (e.g., calcium stearate, magnesium stearate, and the like), sweeteners (e.g., sucrose, lactose, saccharin, maltitol, and the like), flavoring agents (e.g., peppermint, wintergreen oil, and the like), stabilizers (e.g., benzyl alcohol, phenol, and the like), and buffering agents (e.g., phosphate, sodium acetate, and the like).
The pharmaceutical composition according to the present embodiment may be formulated by appropriately combining and mixing the above-mentioned carriers and additives into a unit dosage form that meets generally accepted requirements.
The administration method of the pharmaceutical composition according to the present embodiment is not particularly limited, and may be determined according to the symptoms, weight, age, sex, and the like of the patient. Dosage forms include tablets, dispersions, capsules, liquid formulations, enema formulations, suppositories, and the like. Tablets, dispersions, capsules and liquid formulations are administered orally. Enema formulations and suppositories are administered enterally. The pharmaceutical composition is preferably in a form capable of delivering the active ingredient (a bacterium having 00502 protein or a homolog of 00502 protein, a bacterium having 00509 protein or a homolog of 00509 protein, 00502 protein or a homolog of 00502 protein, or 00509 protein or a homolog of 00509 protein) to the intestinal tract.
The dosage of the pharmaceutical composition depends on the symptoms, weight, age, sex, and other factors of the patient, and cannot be specified unconditionally. If the active ingredient is a living bacterium, it is contemplated that the active ingredient may be administered 1 or more times in a unit dosage form of 0.1 to 100 mg/kg body weight per administration. If the active ingredient is dead bacteria or protein, it is contemplated that the active ingredient may be administered 1 or more times per day in a unit dosage form of 0.1 to 100 mg/kg body weight.
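The per-administration range stated above scales linearly with body weight. The following minimal sketch shows only that arithmetic; the function name and the example body weight are hypothetical, and the result is an illustration of the stated range rather than a dosing recommendation.

```python
# Sketch only: convert the stated 0.1-100 mg/kg range into mg per administration.

MIN_MG_PER_KG = 0.1    # lower bound stated in the text
MAX_MG_PER_KG = 100.0  # upper bound stated in the text


def dose_range_mg(body_weight_kg: float) -> tuple:
    """Return (minimum, maximum) amount in mg for one administration."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    return (MIN_MG_PER_KG * body_weight_kg, MAX_MG_PER_KG * body_weight_kg)


if __name__ == "__main__":
    low, high = dose_range_mg(60.0)  # 60 kg is an arbitrary example weight
    print(f"{low:.1f} mg to {high:.1f} mg per administration")
```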
The pharmaceutical composition according to the present embodiment may be a quasi-drug. Quasi-drugs are products that have a specific efficacy or effect but act mildly on the human body. Quasi-drugs of this embodiment include, but are not limited to, beverages, gastric drugs, enteral formulations, and the like.
[Diagnostic agent for diseases caused by trypsin or TMPRSS2]
In one embodiment, the invention provides a diagnostic agent for a disease caused by trypsin or TMPRSS2 comprising a specific binding substance for detecting: 00502 protein or a protein having 30% or more sequence identity with the amino acid sequence of 00502 protein and having trypsin binding ability, or 00509 protein or a protein having 30% or more sequence identity with the amino acid sequence of 00509 protein and having trypsin binding ability.
In other words, the diagnostic agent of the present embodiment contains a specific binding substance for detecting 00502 protein, 00502 protein homolog, 00509 protein, and 00509 protein homolog at the protein level.
Specific binding substances include antibodies, antibody fragments, aptamers, and the like. Antibody fragments include Fab, F(ab')2, Fab', single-chain antibodies (scFv), and the like. The antibody may be a monoclonal or polyclonal antibody. In addition, commercially available antibodies can be used.
In the diagnostic agent of the present embodiment, 00502 protein, 00502 protein homolog, 00509 protein, 00509 protein homolog, and bacteria having these proteins are the same as those described above. Diseases caused by trypsin or TMPRSS2 are also the same as previously described.
The diagnostic agent of this embodiment can be used to detect the presence of 00502 protein, 00502 protein homolog, 00509 protein, 00509 protein homolog, or bacteria having these proteins in a biological sample from a subject.
The detection principle of the diagnostic agent of the present embodiment is not limited, and for example, methods such as enzyme-linked immunosorbent assay (ELISA), lateral flow immunoassay, western blotting, and flow cytometry (FACS) can be used.
Biological samples derived from subjects include fecal samples and the like. If the presence of 00502 protein, 00502 protein homolog, 00509 protein, 00509 protein homolog, or bacteria having these proteins is detected in a biological sample derived from a subject using the diagnostic agent of the present embodiment, it can be determined that the subject is not suffering from a disease caused by trypsin or TMPRSS2.
If the presence of 00502 protein, 00502 protein homolog, 00509 protein, 00509 protein homolog, or bacteria having these proteins is not detected in the biological sample derived from the subject, it can be judged that the subject is likely to suffer from a disease caused by trypsin or TMPRSS2. In this case, the symptoms of the disease caused by trypsin or TMPRSS2 can be treated or alleviated by administering the above pharmaceutical composition or quasi-drug to the subject.
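The judgment rule in the two preceding paragraphs can be written as a minimal decision sketch. It assumes that each target (protein, homolog, or a bacterium carrying one) has been assayed and reduced to a detected or not-detected result; the target labels, the enum names and the function name are hypothetical, and how a positive call is made by ELISA or another method is outside the sketch.

```python
# Sketch only: the detected / not-detected judgment described in the text.
from enum import Enum


class Assessment(Enum):
    NOT_SUFFERING = "judged not to suffer from a disease caused by trypsin or TMPRSS2"
    LIKELY_SUFFERING = "judged likely to suffer from a disease caused by trypsin or TMPRSS2"


def assess(detected: dict) -> Assessment:
    """Map per-target detection results (target name -> bool) to a judgment.

    Detection of any one target is treated as "present", following the
    "A, B, C, or D is detected" wording of the text.
    """
    if not detected:
        raise ValueError("at least one assay result is required")
    if any(detected.values()):
        return Assessment.NOT_SUFFERING
    return Assessment.LIKELY_SUFFERING


if __name__ == "__main__":
    results = {
        "00502 protein": False,
        "00502 protein homolog": False,
        "00509 protein": False,
        "00509 protein homolog": False,
    }
    print(assess(results).value)
```

When the result is `LIKELY_SUFFERING`, the text proposes administering the pharmaceutical composition or quasi-drug to the subject.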
In one embodiment, the diagnostic agent of this embodiment may be used to detect 00502 protein, 00502 protein homolog, 00509 protein, and 00509 protein homolog at the gene level.
In other words, in one embodiment, the present invention provides a diagnostic agent for a disease caused by trypsin or TMPRSS2 comprising primers or probes for detecting: the HMPREF9441_00858 gene encoding 00502 protein, or a gene having 30% or more sequence identity with the base sequence of the HMPREF9441_00858 gene and encoding a protein having trypsin binding ability; or the HMPREF9441_00850 gene encoding 00509 protein, or a gene having 30% or more sequence identity with the base sequence of the HMPREF9441_00850 gene and encoding a protein having trypsin binding ability. In other words, the diagnostic agent of the present embodiment includes primers or probes for detecting the following genes: a gene comprising the base sequence shown in SEQ ID NO. 1 or a gene having 30% or more sequence identity to the base sequence shown in SEQ ID NO. 1 and encoding a protein having trypsin binding ability, or a gene comprising the base sequence shown in SEQ ID NO. 2 or a gene having 30% or more sequence identity to the base sequence shown in SEQ ID NO. 2 and encoding a protein having trypsin binding ability.
The HMPREF9441_00858 and HMPREF9441_00850 genes are the same as those described above.
The gene which has 30% or more sequence identity with the HMPREF9441_00858 gene sequence and which encodes a protein having trypsin binding ability is a gene encoding a homolog of the 00502 protein described above.
The gene which has 30% or more sequence identity with the HMPREF9441_00850 gene sequence and which encodes a protein having trypsin binding ability is a gene encoding a homolog of the 00509 protein described above.
The diagnostic agent of the present embodiment can be used to detect the presence or absence of a bacterium having 00502 protein, a bacterium having 00502 protein homolog, a bacterium having 00509 protein, and a bacterium having 00509 protein homolog in a biological sample derived from a subject.
The detection principle of the diagnostic agent of the present embodiment is not limited, and for example, methods such as PCR, RNA sequencing (RNA-Seq), genomic analysis, and DNA microarray analysis can be used.
The biological sample derived from the subject is the same as the above sample, and for example, a fecal sample or the like can be used. If the presence of a bacterium having 00502 protein, a bacterium having 00502 protein homolog, a bacterium having 00509 protein, or a bacterium having 00509 protein homolog is detected in a biological sample derived from a subject using the diagnostic agent of the present embodiment, it can be judged that the subject is not suffering from a disease caused by trypsin or TMPRSS2.
If the presence of a bacterium having 00502 protein, a bacterium having a homolog of 00502 protein, a bacterium having 00509 protein, or a bacterium having a homolog of 00509 protein is not detected in a biological sample derived from the subject, it can be judged that the subject is likely to suffer from a disease caused by trypsin or TMPRSS2.
In this case, the disease symptoms caused by trypsin or TMPRSS2 may be treated or alleviated by administering the aforementioned pharmaceutical composition or quasi-drug to the subject.
[Other embodiments]
In one embodiment, the invention provides a method of treating a disease caused by trypsin or TMPRSS2 comprising administering to a patient in need of treatment an effective amount of bacteria: a bacterium having 00502 protein, a bacterium having a protein having 30% or more sequence identity with the amino acid sequence of 00502 protein and having trypsin binding ability, a bacterium having 00509 protein, or a bacterium having a protein having 30% or more sequence identity with the amino acid sequence of 00509 protein and having trypsin binding ability.
In one embodiment, the invention provides a method of treating a disease caused by trypsin or TMPRSS2, the method comprising administering to a patient in need of treatment an effective amount of bacteria: a bacterium having a gene comprising a base sequence shown in SEQ ID NO. 1, a bacterium having a gene which has 30% or more of sequence identity to a base sequence shown in SEQ ID NO. 1 and encodes a protein having trypsin binding ability, a bacterium having a gene comprising a base sequence shown in SEQ ID NO. 2, or a bacterium having a gene which has 30% or more of sequence identity to a base sequence shown in SEQ ID NO. 2 and encodes a protein having trypsin binding ability.
In one embodiment, the invention provides a method of treating a disease caused by trypsin or TMPRSS2, the method comprising administering to a patient in need of treatment an effective amount of a protein of: 00502 protein, a protein having 30% or more sequence identity with the amino acid sequence of 00502 protein and having trypsin binding capacity, 00509 protein, or a protein having 30% or more sequence identity with the amino acid sequence of 00509 protein and having trypsin binding capacity.
In one embodiment, the invention provides a method of diagnosing and treating a disease caused by trypsin or TMPRSS2, the method comprising detecting the presence or absence of 00502 protein, a homolog of 00502 protein, 00509 protein, or a homolog of 00509 protein in a biological sample from a subject; wherein, when the presence of the above protein or homolog is not detected, the subject is indicated to have a disease caused by trypsin or TMPRSS2; and, when the presence of the protein or homolog is not detected, administering to the subject an effective amount of: a bacterium having 00502 protein, a bacterium having a protein which has 30% or more sequence identity to the amino acid sequence of 00502 protein and has trypsin binding ability, a bacterium having 00509 protein, a bacterium having a protein which has 30% or more sequence identity to the amino acid sequence of 00509 protein and has trypsin binding ability, 00502 protein, a protein which has 30% or more sequence identity to the amino acid sequence of 00502 protein and has trypsin binding ability, 00509 protein, or a protein which has 30% or more sequence identity to the amino acid sequence of 00509 protein and has trypsin binding ability.
In one embodiment, the present invention provides a method for diagnosing and treating a disease caused by trypsin or TMPRSS2, the method comprising detecting in a biological sample derived from a subject the presence or absence of a bacterium having 00502 protein, a bacterium having 00502 protein homolog, a bacterium having a gene comprising the base sequence shown in SEQ ID NO. 1, a bacterium having a gene having 30% or more sequence identity to the base sequence shown in SEQ ID NO. 1 and encoding a protein having trypsin binding ability, a bacterium having 00509 protein, a bacterium having 00509 protein homolog, a bacterium having a gene comprising the base sequence shown in SEQ ID NO. 2, or a bacterium having a gene having 30% or more sequence identity to the base sequence shown in SEQ ID NO. 2 and encoding a protein having trypsin binding ability; wherein, when the presence of bacteria having the above proteins, homologs or genes is not detected, the subject is indicated to have a disease caused by trypsin or TMPRSS2; and, when the presence of bacteria having the above proteins, homologs or genes is not detected, administering to the subject an effective amount of: a bacterium having 00502 protein, a bacterium having a protein having 30% or more sequence identity to the amino acid sequence of 00502 protein and having trypsin binding ability, a bacterium having a gene comprising the base sequence shown in SEQ ID NO. 1, a bacterium having a gene having 30% or more sequence identity to the base sequence shown in SEQ ID NO. 1 and encoding a protein having trypsin binding ability, a bacterium having 00509 protein, a bacterium having a protein having 30% or more sequence identity to the amino acid sequence of 00509 protein and having trypsin binding ability, a bacterium having a gene comprising the base sequence shown in SEQ ID NO. 2, a bacterium having a gene having 30% or more sequence identity to the base sequence shown in SEQ ID NO. 2 and encoding a protein having trypsin binding ability, 00502 protein, a protein having 30% or more sequence identity to the amino acid sequence of 00502 protein and having trypsin binding ability, 00509 protein, or a protein having 30% or more sequence identity to the amino acid sequence of 00509 protein and having trypsin binding ability.
In one embodiment, the present invention provides a composition for treating diseases caused by trypsin or TMPRSS2, comprising as active ingredients the following bacteria: a bacterium having 00502 protein, a bacterium having a protein having 30% or more sequence identity to the amino acid sequence of 00502 protein and having trypsin binding ability, a bacterium having a gene comprising the base sequence shown in SEQ ID NO. 1, a bacterium having a gene having 30% or more sequence identity to the base sequence shown in SEQ ID NO. 1 and encoding a protein having trypsin binding ability, a bacterium having 00509 protein, a bacterium having a protein having 30% or more sequence identity to the amino acid sequence of 00509 protein and having trypsin binding ability, a bacterium having a gene comprising the base sequence shown in SEQ ID NO. 2, or a bacterium having a gene having 30% or more sequence identity to the base sequence shown in SEQ ID NO. 2 and encoding a protein having trypsin binding ability.
In one embodiment, the present invention provides a pharmaceutical composition for treating diseases caused by trypsin or TMPRSS2, comprising as active ingredients the following proteins: 00502 protein, a protein having 30% or more sequence identity with the amino acid sequence of 00502 protein and having trypsin binding capacity, 00509 protein, or a protein having 30% or more sequence identity with the amino acid sequence of 00509 protein and having trypsin binding capacity.
In one embodiment, the invention provides a method for preparing a pharmaceutical composition for treating a disease caused by trypsin or TMPRSS2, the method comprising using: a bacterium having 00502 protein, a bacterium having a protein having 30% or more sequence identity to the amino acid sequence of 00502 protein and having trypsin binding ability, a bacterium having a gene comprising the base sequence shown in SEQ ID NO. 1, a bacterium having a gene having 30% or more sequence identity to the base sequence shown in SEQ ID NO. 1 and encoding a protein having trypsin binding ability, a bacterium having 00509 protein, a bacterium having a protein having 30% or more sequence identity to the amino acid sequence of 00509 protein and having trypsin binding ability, a bacterium having a gene comprising the base sequence shown in SEQ ID NO. 2, or a bacterium having a gene having 30% or more sequence identity to the base sequence shown in SEQ ID NO. 2 and encoding a protein having trypsin binding ability.
In one embodiment, the invention provides a method for preparing a pharmaceutical composition for treating a disease caused by trypsin or TMPRSS2, the method comprising using: 00502 protein, a protein having 30% or more sequence identity with the amino acid sequence of 00502 protein and having trypsin binding capacity, 00509 protein, or a protein having 30% or more sequence identity with the amino acid sequence of 00509 protein and having trypsin binding capacity.
[Examples]
The present invention will be described in more detail by way of examples, but the present invention is not limited to the following examples.
[Materials and methods]
(Mice)
C57BL/6 mice maintained under specific-pathogen-free (SPF) or germ-free (GF) conditions were purchased from Sankyo Laboratories Japan, SLC Japan, Charles River Japan or CLEA Japan. GF mice and gnotobiotic mice were bred and maintained in the germ-free facility of the RIKEN Center for Integrative Medical Sciences. All animal experiments were approved by the animal experiment committee of RIKEN.
(Bacterial strains)
Paraprevotella clara (JCM 14859), Paraprevotella xylaniphila (JCM 14860), Prevotella copri (JCM 13464), Prevotella denticola (JCM 13449), Prevotella stercorea (JCM 13469) and Prevotella oulorum (JCM 14966) were obtained from the Japan Collection of Microorganisms (JCM). Paraprevotella clara strains (P237E 3B) and (P322B 5) were provided by Vedanta Biosciences. Paraprevotella xylaniphila (82A 6) was isolated by the inventors.
(Proteomic analysis of cecal content)
Proteins in the cecal content were extracted by pipetting and mixing with Tris-buffered saline (TBST) containing protease inhibitors. The supernatant was transferred to a fresh tube, 25% trichloroacetic acid was added (final concentration 12.5% v/v), and the mixture was incubated for 1 hour at 4 °C. After centrifugation at 15,000 × g for 15 minutes at 4 °C, the supernatant was removed, and the precipitate was washed twice with acetone and dried with the lid open. The dried sample was then redissolved in 0.5% sodium laurate and 100 mM Tris-HCl (pH 8.5) using a water bath sonicator (Bioruptor UCD-200, SonicBio Corp). The protein concentration of the redissolved sample was determined by the bicinchoninic acid (BCA) assay and adjusted to 1 μg/μL. Pretreatment for proteomic analysis was performed according to the reported method (Kawashima Y., et al., Optimisation of Data-Independent Acquisition Mass Spectrometry for Deep and Highly Sensitive Proteomic Analysis, Int J Mol Sci. 20(23), 5932, 2019).
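The trichloroacetic acid step above is a simple dilution: adding a 25% stock to reach a 12.5% final concentration means adding a volume of stock equal to the sample volume. The minimal sketch below shows the arithmetic, assuming that volumes are additive and that the sample itself contains no trichloroacetic acid; the function name and the 500 μL example volume are hypothetical.

```python
# Sketch only: volume of stock needed to reach a target final concentration.
# Mass balance: stock_percent * v_stock = final_percent * (v_sample + v_stock)


def stock_volume_to_add(sample_volume: float, stock_percent: float,
                        final_percent: float) -> float:
    """Return the stock volume (same unit as sample_volume) to add."""
    if not 0 < final_percent < stock_percent:
        raise ValueError("final concentration must be between 0 and the stock concentration")
    return sample_volume * final_percent / (stock_percent - final_percent)


if __name__ == "__main__":
    # 500 uL of supernatant is an arbitrary example volume.
    print(stock_volume_to_add(500.0, stock_percent=25.0, final_percent=12.5))
```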
The peptides were injected directly onto a 75 μm × 15 cm PicoFrit emitter (New Objective) packed in-house with 2.7 μm core-shell C18 particles (CAPCELL CORE MP 2.7 μm material; Osaka Soda Co., Ltd.) and were separated using an Eksigent ekspert nanoLC HPLC system (Sciex) with a 180-minute gradient at a flow rate of 300 nL/min.
Peptides eluted from the column were analyzed by shotgun MS and SWATH (Sequential Window Acquisition of All Theoretical Mass Spectra)-MS using a TripleTOF 5600+ mass spectrometer (Sciex). In the shotgun MS experiments, MS1 spectra were collected for 250 ms in the range of 400-1000 m/z. The top 25 precursor ions with charge states from 2+ to 5+ and more than 150 counts/s were selected and fragmented using rolling collision energy, and MS2 spectra were collected for 100 ms in the range of 100-1500 m/z. The dynamic exclusion time was set to 24 seconds.
In the SWATH-MS experiments, the mass spectrometer was run in a cyclic data-independent acquisition mode in which the precursor isolation window was advanced in 12 m/z steps. The isolation width was 13 m/z (window overlap of 1 m/z), and a set of 50 windows covering the precursor mass range of 400-1000 m/z was constructed. SWATH MS2 spectra were collected in the range of 100-1500 m/z, with each MS2 experiment acquired for 60 ms. In each MS2 experiment, precursor ions were fragmented using rolling collision energy.
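The SWATH window scheme described above can be checked with a short sketch that generates 50 isolation windows of 13 m/z width stepped by 12 m/z. It assumes that the first window starts at 400 m/z and that the 1 m/z overlap lies at the upper edge of each window; the text gives only the step, width, count and covered range, so the exact window boundaries are an assumption, and the function names are hypothetical.

```python
# Sketch only: 50 SWATH isolation windows, 13 m/z wide, stepped by 12 m/z.
# Assumption: the first window starts at 400 m/z.


def swath_windows(start: float = 400.0, step: float = 12.0,
                  width: float = 13.0, count: int = 50) -> list:
    """Return a list of (lower, upper) precursor isolation windows in m/z."""
    return [(start + i * step, start + i * step + width) for i in range(count)]


def ms2_time_per_cycle_ms(window_count: int, accumulation_ms: float) -> float:
    """Total MS2 accumulation time for one pass over all windows."""
    return window_count * accumulation_ms


if __name__ == "__main__":
    windows = swath_windows()
    print(windows[0], windows[1], windows[-1])
    print(ms2_time_per_cycle_ms(len(windows), 60.0), "ms of MS2 acquisition per cycle")
```

Under these assumptions the last window ends just above 1000 m/z, so the 50 windows cover the stated 400-1000 m/z precursor range, and adjacent windows share 1 m/z.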
All shotgun MS files were searched against the mouse UniProt reference proteome (UniProt ID UP000000589, reviewed, canonical) using ProteinPilot software v.4.5 with the Paragon algorithm (Sciex) for protein identification.
The confidence threshold for protein identification was set to a ProteinPilot unused score of 1.3, and at least one peptide with a confidence level of 95% was required. In this study, the overall false discovery rate for both peptides and proteins was less than 1%. The identified proteins were quantified from SWATH-MS data using PeakView v.2.2 (Sciex).
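The unused score of 1.3 and the 95% peptide confidence quoted above are consistent with each other if the score is taken as the negative base-10 logarithm of one minus the confidence, which is how the ProteinPilot ProtScore is commonly described. The sketch below assumes that relationship; it is not stated in this document, and the function name is hypothetical.

```python
# Sketch only: relation between a confidence level and a ProtScore-style value.
# Assumption: score = -log10(1 - confidence), with confidence given in percent.
import math


def protscore_from_confidence(confidence_percent: float) -> float:
    """Convert a percent confidence (0 <= c < 100) into a log-scaled score."""
    if not 0 <= confidence_percent < 100:
        raise ValueError("confidence must be in the range [0, 100)")
    return -math.log10(1.0 - confidence_percent / 100.0)


if __name__ == "__main__":
    for confidence in (66.0, 95.0, 99.0):
        print(confidence, round(protscore_from_confidence(confidence), 2))
```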
(Proteomic analysis of P. clara culture supernatant)
25% trichloroacetic acid was added to the P. clara culture supernatant (final concentration 12.5% v/v), and the mixture was incubated at 4 °C for 1 hour. After centrifugation at 15,000 × g for 15 minutes at 4 °C, the supernatant was removed. The precipitate was washed twice with acetone and dried with the lid open. The dried sample was then redissolved in 0.5% sodium laurate and 100 mM Tris-HCl (pH 8.5) using a water bath sonicator (Bioruptor UCD-200), and the protein concentration of the redissolved sample was measured by the BCA assay and adjusted to 1 μg/μL. Pretreatment for shotgun proteomic analysis was performed as described above.
The peptides were injected directly onto a 75 μm × 20 cm PicoFrit emitter packed with 2.7 μm core-shell C18 particles. Separation was then carried out at 50 ℃ using an UltiMate 3000 RSLCnano LC system (Thermo Fisher Scientific) with an 80-minute gradient at a flow rate of 100 nL/min. The peptide fragments eluted from the column were analyzed by overlapping-window DIA-MS using a Q Exactive HF-X (Thermo Fisher Scientific). MS1 spectra were collected in the 495-785 m/z range at a resolution of 30,000, with the automatic gain control target value set to 3 × 10^6 and the maximum injection time set to 55.
MS2 spectra were collected in a range of 200 m/z at a resolution of 30,000, with the automatic gain control target value set to 3 × 10^6, the maximum injection time set to "auto", the stepped normalized collision energy set to 22, 26 and 30%, and the MS2 isolation window width set to 4 m/z. An overlapping window pattern covering 500-780 m/z was used. The window configuration was optimized using Skyline software.
MS files were searched against the P. clara spectral library using Scaffold DIA (Proteome Software). The spectral library was generated using Prosit software from the protein sequence database of P. clara (UniProt id UP000000589, reviewed, canonical).
The P. clara protein sequence database was created in-house by metagenomic analysis. The search parameters for Scaffold DIA were as follows. Experimental data search enzyme: trypsin; maximum missed cleavage sites: (not specified); precursor mass tolerance: 8 ppm; fragment mass tolerance: 8 ppm; static modification: carbamidomethylation of cysteine.
The threshold for protein identification was set so that the false discovery rates of peptides and proteins were below 1%. Peptide quantities were calculated by the EncyclopeDIA algorithm in Scaffold DIA. For each peptide, the 4 highest-quality fragment ions were selected for quantification. The quantitative value of each protein was estimated as the sum of the quantitative values of its peptides.
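The roll-up described above (a fixed number of selected fragment ions per peptide, peptide values summed per protein) can be sketched as below. This is a minimal illustration, not the EncyclopeDIA or Scaffold DIA implementation; the per-fragment ranking score, the function names and the example numbers are all hypothetical.

```python
def peptide_quantity(fragments, n_top=4):
    """Sum the intensities of the n_top best-ranked fragment ions.

    `fragments` is a list of (ranking_score, intensity) tuples. The ranking
    score is a placeholder for whatever criterion the software uses.
    """
    best = sorted(fragments, key=lambda f: f[0], reverse=True)[:n_top]
    return sum(intensity for _, intensity in best)


def protein_quantity(peptides):
    """Estimate a protein value as the sum of its peptide quantities.

    `peptides` maps a peptide sequence to its list of fragment tuples.
    """
    return sum(peptide_quantity(frags) for frags in peptides.values())


# Made-up example data for one protein with two peptides.
example_protein = {
    "PEPTIDEA": [(0.9, 100.0), (0.8, 200.0), (0.7, 300.0), (0.6, 400.0), (0.1, 1000.0)],
    "PEPTIDEB": [(0.5, 50.0), (0.4, 25.0)],
}
print(protein_quantity(example_protein))
```

In the example, the lowest-ranked fragment of the first peptide is excluded even though it is the most intense, which is the point of ranking fragments before summing.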
(Peptidomic analysis)
Acetonitrile (ACN) containing 0.1% trifluoroacetic acid (TFA) was added to the cecal content, and the mixture was dried in a centrifugal evaporator. Acetone was added to the dried sample, lipids and low-molecular-weight compounds were dissolved using a water bath sonicator, and the sample was then centrifuged at 15,000 × g for 15 minutes at 4 ℃.
After removal of the supernatant, 70% ACN-HCl was added to the precipitate, and the peptides were redissolved using a water bath sonicator and centrifuged at 15,000 × g for 15 minutes at 4 ℃. The supernatant was then transferred to a new tube and dried in a centrifugal evaporator. The dried samples were redissolved in 100 mM Tris-HCl containing protease inhibitors and treated with 10 mM dithiothreitol at 50 ℃ for 30 minutes. Alkylation with 30 mM iodoacetamide was then carried out for 30 minutes at room temperature in the dark, and the samples were acidified with trifluoroacetic acid (final concentration 0.5%). The acidified samples were desalted with MonoSpin C18 (GL Sciences).
The peptides were injected directly onto a 75 μm × 25 cm PicoFrit emitter (New Objective) packed with C18 core-shell particles (CAPCELL CORE MP 2.7 μm, Osaka Soda). Separation was then carried out at 50 ℃ using an UltiMate 3000 RSLCnano LC system (Thermo Fisher Scientific) with a 90-minute gradient at a flow rate of 100 nL/min.
The peptide fragments eluted from the column were analyzed by DDA-MS using a Q Exactive HF-X (Thermo Fisher Scientific). MS1 spectra were collected in the 350-1500 m/z range at a resolution of 120,000, with the automatic gain control (AGC) target value set to 3 × 10^6.
In data-dependent mode, the 30 most intense ions with charge states from 2+ to 5+ and intensities exceeding 4.4 × 10^3 were fragmented by collision-induced dissociation at a normalized collision energy of 26%, and tandem mass spectra were acquired with the Orbitrap mass analyzer at a mass resolution of 30,000 at 200 m/z. The AGC target value was set to 1 × 10^5.
MS files were searched against the mouse UniProt reference proteome (UniProt id UP000000589, reviewed, canonical) using PEAKS Studio.
The search conditions were as follows: the precursor mass tolerance was 8 ppm, the fragment ion mass tolerance was 0.01 Da, the enzyme was present, the fixed modification was carbamidomethylation, and the variable modification was oxidation (M). Identified peptides were filtered so that the peptide false discovery rate was below 1%.
(Western Blot)
Mouse fecal samples were suspended at a 50-fold dilution in phosphate-buffered saline (PBS) supplemented with protease inhibitor cocktail (Roche). The suspensions were then centrifuged at 15,000 × g for 10 minutes at 4 ℃, and the supernatant was subjected to Western Blot. Mouse pancreatic tissue was snap-frozen in liquid nitrogen, and protein was extracted with TRIzol reagent (Thermo Fisher Scientific) at a final protein concentration of 4 μg/μL. SDS-PAGE and blotting were performed using the Novex® NuPAGE® SDS-PAGE Gel System (Thermo Fisher Scientific) and the iBlot™ 2 Dry Blotting System (Thermo Fisher Scientific) according to the manufacturer's instructions.
In some experiments, SDS-PAGE and transfer to PVDF membranes (0.2 μm Transfer Membranes Immobilon-PSQ, Merck Millipore) were performed using the XV PANTERA SYSTEM (DRC) according to the manufacturer's instructions.
Membranes were stained using the iBind™ Western System (Thermo Fisher Scientific).
The antibodies used were as follows: rabbit anti-mouse PRSS2 antibody (Cosmo Bio), anti-mouse HSP90 antibody (#4877, Cell Signaling Technology), rabbit anti-human PRSS2 antibody (LS-B15726, LSBio), rabbit anti-human PRSS1 antibody (LS-331381, LSBio), rabbit anti-mouse TMPRSS2 antibody (LS-C373022, LSBio, raised against a sequence in the protease domain), rabbit anti-6-His antibody (A190-214A, Bethyl Laboratories), goat anti-mouse IgA alpha chain antibody (HRP) (ab97235, Abcam), rat anti-mouse kappa chain antibody (HRP) (ab99632, Abcam), rabbit anti-mouse CELA3B antibody (OACD03205, Aviva Systems Biology), anti-rabbit IgG antibody (HRP) (#7074, Cell Signaling Technology), and rabbit anti-mouse Reg3 antibody (51153-R005, Sino Biological).
Chemi-Lumi One (Nacalai Tesque) was used for chemiluminescent detection, and the Molecular Imager® ChemiDoc™ XRS+ System (Bio-Rad) or iBright™ FL1500 was used for imaging.
(RT-qPCR)
RNA from mouse pancreas was extracted with TRIzol reagent (Thermo Fisher Scientific). The extracted RNA was reverse-transcribed into cDNA using ReverTra Ace® qPCR RT Master Mix with gDNA Remover (Toyobo).
RT-qPCR was performed using Thunderbird SYBR qPCR Mix (Toyobo) and a LightCycler 480 (Roche), and the data were analyzed using the ΔΔCt method. GAPDH was used as an endogenous control. The base sequences of the primers used are shown below.
GAPDH Forward primer: 5'-GTCGTGGAGTCTACTGGTCTTC-3' (SEQ ID NO: 35)
GAPDH Reverse primer: 5'-GTCATATTTCTCGTGGTTCACACC-3' (SEQ ID NO: 36)
PRSS2 Forward primer: 5'-TGTGACCCCCTCAATGCCAGAG-3' (SEQ ID NO: 37)
PRSS2 Reverse primer: 5'-AGCACTGGGGCATCAACAC-3' (SEQ ID NO: 38)
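For reference, the ΔΔCt calculation named above can be written out as a short sketch. It assumes the usual simplification of 100% amplification efficiency (a doubling per cycle); the function name and the Ct values are illustrative and not taken from this study.

```python
def fold_change_ddct(ct_target_sample, ct_ref_sample,
                     ct_target_control, ct_ref_control):
    """Relative expression by the delta-delta-Ct method.

    Assumes 100% PCR efficiency, i.e. the product doubles every cycle.
    The reference gene here would be GAPDH and the target gene PRSS2.
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    delta_delta = delta_sample - delta_control
    return 2.0 ** (-delta_delta)


# Illustrative Ct values only.
print(fold_change_ddct(22.0, 18.0, 24.0, 18.0))
```

A result of 1.0 means equal normalized expression in the sample and the control group.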
(immunofluorescent staining)
Mouse intestinal tissue (including feces) was collected and fixed overnight at 4 ℃ with Carnoy's solution (60% methanol, 30% chloroform, 10% glacial acetic acid). Paraffin embedding was performed using a tissue processor (Leica Microsystems). The paraffin blocks were then cut into thin sections (5.0 μm) using a microtome, deparaffinized and immunostained.
The antibodies and stains used for immunofluorescence were as follows: rabbit anti-PRSS2 antibody (LSBio), Alexa 488-labeled goat anti-rabbit IgG antibody (Thermo Fisher Scientific), 4',6-diamidino-2-phenylindole (DAPI) (Dojindo), and rhodamine-labeled UEA1 (Ulex Europaeus Agglutinin 1, Vector Laboratories). Immunofluorescence imaging was performed using a Leica AF600 and a Leica TCS SP5 confocal microscope.
(measurement of trypsin Activity in mouse and human faecal samples)
The intestinal contents or stool samples of mice were diluted 500-fold (w/v) in 0.9% NaCl solution. Human feces were diluted 200-fold (w/v) in 0.9% NaCl solution. The diluted samples were vortexed on a small shaker at 2000 rpm for 20 minutes, homogenized by pipetting, and centrifuged at 10,000 × g for 15 minutes at 4 ℃. The supernatant was then collected, and trypsin activity was measured using the Trypsin Activity Assay Kit (Colorimetric), 100 tests (ab102531), according to the manufacturer's protocol. Absorbance was measured in kinetic mode at a wavelength of 405 nm using a PerkinElmer 2030 Multilabel Reader.
(colonization of human microbiota in GF mice)
Human stool samples were collected at RIKEN and Keio University according to study protocols approved by the institutional review boards. Informed consent was obtained from each subject. Human fecal samples (stored in 20% (v/v) glycerol) were transferred to an anaerobic chamber, thawed, filtered through a 100 μm mesh, transferred to a GF isolator, and orally administered to GF mice at a dose of 200 μL/mouse.
For antibiotic treatment, solutions of 0.5 g/L ampicillin (Nacalai Tesque), 0.5 g/L metronidazole (Nacalai Tesque) and 1.0 g/L tylosin (Sigma) were each prepared in autoclaved tap water. Mice that had been orally administered the fecal content of donor C were given the antibiotic solution continuously for 12 days. The antibiotic solution was changed once a week.
(isolation and identification of colony forming species from the fecal content of mice)
The intestinal contents of the mice were mixed with PBS containing glycerol (20%) in an anaerobic chamber and stored at -80 ℃. In the anaerobic chamber, the stock was mixed with an equal volume of TS broth (BD) and spread on the following agar plates: EG, ES, M10, NBGT, VS, TS (BD), BL (Eiken Chemical), BBE (Kyokuto Pharmaceutical), Oxoid CM0619 (Thermo Fisher Scientific), CM0619 supplemented with SR0107 (Thermo Fisher Scientific), CM0619 supplemented with SR0108 (Thermo Fisher Scientific), mGAM (Nissui Pharmaceutical), and Schaedler (BD). After 2 days of incubation, colonies with different appearances were transferred to new EG plates. The colonies were then grown overnight in EGEF broth, mixed with glycerol (final concentration 20% (v/v)) and stored at -80 ℃.
The formulation of EG (Eggerth Gagnon) agar medium is shown below. Peptone No. 3 (10.0 g), yeast extract (5.0 g), Na2HPO4 (4.0 g), glucose (1.5 g), soluble starch (0.5 g), L-cysteine hydrochloride (0.5 g), L-cysteine (0.2 g), Tween 80 (0.5 g), agar (4.8 g) and horse meat extract (500 mL), made up to 1000 mL with water, plus defibrinated horse blood (50 mL). For EGEF medium, the agar was omitted and the defibrinated horse blood (50 mL) was replaced with Fildes solution (40 mL).
Bacterial genomic DNA was then extracted from the isolates using the same protocol as for DNA isolation from feces. The 16S rRNA gene was amplified by PCR using the KOD Plus Neo Kit (Toyobo) according to the manufacturer's protocol. Sanger sequencing was performed by Eurofins. The nucleotide sequences were searched against the NCBI database. The primer sequences used for Sanger sequencing are shown below.
F27 primer: 5'-AGRGTTTGATYMTGGCTCAG-3' (SEQ ID NO: 39)
R1492 primer: 5'-TACGGYTACCTTGTTACGACTT-3' (SEQ ID NO: 40)
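The F27 and R1492 primers contain IUPAC degenerate bases (R, Y and M), so each primer represents a small pool of concrete sequences. The following sketch, offered only as an illustration with hypothetical function names, expands a degenerate primer into that pool using the standard IUPAC nucleotide codes.

```python
from itertools import product

# Standard IUPAC nucleotide codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT", "S": "CG", "W": "AT",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(primer):
    """Return every concrete sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


f27_variants = expand_degenerate("AGRGTTTGATYMTGGCTCAG")
r1492_variants = expand_degenerate("TACGGYTACCTTGTTACGACTT")
print(len(f27_variants), len(r1492_variants))
```

F27 has three two-fold degenerate positions and therefore expands to 8 sequences, while R1492 has one and expands to 2.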
(16S rRNA sequencing)
Frozen mouse fecal samples were thawed, and 100 μL of the suspension was mixed with 900 μL of TE10 buffer (10 mM Tris-HCl, 10 mM EDTA) containing RNase A (final concentration 100 μg/mL, Thermo Fisher Scientific) and lysozyme (final concentration 3.0 mg/mL, Sigma). The suspension was incubated at 37 ℃ for 1 hour with gentle agitation. Purified achromopeptidase (Wako) was added to a final concentration of 2,000 units/mL, and the suspension was incubated for a further 30 minutes at 37 ℃. Sodium dodecyl sulfate (final concentration 1%) and proteinase K (final concentration 1 mg/mL, Nacalai) were then added to the suspension, which was incubated for 1 hour at 55 ℃.
High-molecular-weight DNA was then extracted with phenol:chloroform:isoamyl alcohol (25:24:1), precipitated with isopropanol, washed with 70% ethanol and resuspended in 100 μL of TE.
PCR was performed using Ex Taq (Takara). The primer sequences for amplifying the V1-V2 region of the 16S rRNA gene are as follows. 27Fmod primer: 5'-AATGATACGGCGACCACCGAGATCTACACxxxxxxxxACACTCTTTCCCTACACGACGCTCTTCCGATCTAGRGTTTGATYMTGGCTCAG-3' (SEQ ID NO: 41, "xxxxxxxx" denotes the MiSeq (Illumina) index sequence).
338R primer: 5'-CAAGCAGAAGACGGCATACGAGATxxxxxxxxGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTTGCTGCCTCCCGTAGGAGT-3' (SEQ ID NO: 42, "xxxxxxxx" denotes the MiSeq (Illumina) index sequence).
The PCR products were purified using Agencourt AMPure XP (Beckman Coulter) according to the manufacturer's protocol. The 16S rRNA library was prepared according to the protocol using the KAPA Library Quantification Kit (Kapa Biosystems). 16S rRNA sequencing was performed using the standard protocol of the MiSeq Reagent Kit v3.
(Gnotobiotic experiment)
The isolated strains, except Phascolarctobacterium faecium (3G4), were cultured in an anaerobic chamber at 37 ℃ for 1-2 days. Phascolarctobacterium faecium was cultured on Oxoid CM0619 agar plates supplemented with 80 mM sodium succinate for 2-3 days, and colonies were collected and resuspended in EGEF medium. Bacterial density was adjusted according to OD600 values, and the mixture of cultured strains was orally administered to GF mice (150 μL/mouse, total bacterial count approximately 1-2 × 10^8 CFU).
To quantify P. clara DNA in feces, mouse fecal DNA was purified, and the 16S rRNA gene sequence of the P. clara strain was amplified on the LightCycler 480 system (Roche) using THUNDERBIRD SYBR qPCR Mix (Toyobo). Standard curves were prepared from serial dilutions of genomic DNA of the P. clara (JCM 14859) strain. The primer sequences used are shown below.
5'-CGTAGGAGTTTGGACCGTGT-3'(SEQ ID NO:43)
5'-GCATGGGAGCGACAAATAAA-3'(SEQ ID NO:44)
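Quantification against a standard curve prepared from serial dilutions can be sketched as follows. This is a generic least-squares illustration under the usual assumption that Ct is linear in log10 of the template amount; the function names and the dilution and Ct values are hypothetical and are not data from this study.

```python
import math


def fit_standard_curve(quantities, cts):
    """Fit Ct = slope * log10(quantity) + intercept by least squares."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantity_from_ct(ct, slope, intercept):
    """Interpolate the template amount of an unknown sample from its Ct."""
    return 10 ** ((ct - intercept) / slope)


# Made-up 10-fold dilution series of genomic DNA (arbitrary units) and Ct values.
slope, intercept = fit_standard_curve([1e5, 1e4, 1e3, 1e2], [15.0, 18.3, 21.6, 24.9])
print(slope, intercept)
print(quantity_from_ct(20.0, slope, intercept))
```

The slope also gives the amplification efficiency as 10^(-1/slope) - 1, which is a common check on the quality of the dilution series.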
(Citrobacter rodentium (C. rodentium) infection)
GF mice were orally administered 200 μL of 2-mix (B. uniformis and P. merdae) + P. clara (WT), 2-mix + P. clara (Δ00502), or 2-mix only. After 14 days, the mice were orally infected with overnight-cultured C. rodentium (150 μL/mouse) and euthanized on day 7.
For histological analysis, mouse colons were fixed with 4% paraformaldehyde and embedded in paraffin. The paraffin blocks were sectioned and stained with hematoxylin and eosin.
The extent of colitis was assessed by gastroenterologists according to the following criteria: inflammatory cell infiltration (score, 0-4), mucosal thickening (score, 0-4), goblet cell depletion (score, 0-4), crypt abscess (score, 0-4), and tissue structure destruction (score, 0-4). The final histological score is defined as the sum of the scores of these parameters.
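The scoring scheme above amounts to summing five sub-scores, each in the range 0-4, giving a total between 0 and 20. A minimal sketch follows, with hypothetical key names and a made-up example.

```python
CRITERIA = (
    "inflammatory_cell_infiltration",
    "mucosal_thickening",
    "goblet_cell_depletion",
    "crypt_abscess",
    "tissue_structure_destruction",
)


def histological_score(scores):
    """Return the final histological score as the sum of the five sub-scores."""
    total = 0
    for name in CRITERIA:
        value = scores[name]
        if not 0 <= value <= 4:
            raise ValueError(f"{name} must be between 0 and 4, got {value}")
        total += value
    return total


# Made-up sub-scores for one section.
example = dict(zip(CRITERIA, (3, 2, 1, 0, 2)))
print(histological_score(example))
```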
For colony-forming unit (CFU) determination, the cecal patch or cecal luminal contents were collected and homogenized in PBS, and serial dilutions of the homogenates were plated on LB agar plates. CFU were counted after overnight incubation at 37 ℃ under aerobic conditions.
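Converting a colony count on a dilution plate back to a bacterial load is simple arithmetic, sketched below. The plated volume, homogenate volume and sample mass are not stated in the text, so every parameter and number here is a hypothetical placeholder.

```python
import math


def cfu_per_gram(colonies, dilution_factor, plated_volume_ml,
                 homogenate_volume_ml, sample_mass_g):
    """Back-calculate CFU per gram of sample from one countable plate.

    dilution_factor is the total fold dilution of the plated suspension
    relative to the original homogenate (e.g. 1e4 for a 10^-4 dilution).
    """
    cfu_per_ml_homogenate = colonies * dilution_factor / plated_volume_ml
    return cfu_per_ml_homogenate * homogenate_volume_ml / sample_mass_g


# Made-up example: 42 colonies on the 10^-4 plate, 0.1 mL plated,
# 0.05 g of cecal content homogenized in 1 mL of PBS.
load = cfu_per_gram(42, 1e4, 0.1, 1.0, 0.05)
print(f"{load:.2e} CFU/g")
```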
For in vitro evaluation of C. rodentium-specific IgA, the cecal content was resuspended in a 5-fold volume (w/v) of LB medium and centrifuged, and the supernatant was filtered through a sterile PVDF filter unit with a pore size of 0.22 μm and then mixed with an equal volume of an overnight in vitro culture of C. rodentium. The mixture was incubated at room temperature with gentle shaking for 1 hour, and agglutination was examined under a confocal microscope (Leica TCS SP8). Alternatively, after incubation, the mixture was centrifuged and washed once with PBS, and the bacterial pellet was lysed in 1% SDS solution (diluted in 50 mM Tris-HCl buffer containing 5 mM EDTA). The lysates were analyzed by Western Blot using goat anti-mouse IgA alpha chain antibody (HRP) (ab97235) to assess the relative amount of IgA in the cecal content that bound to C. rodentium.
For vaccination experiments, GF mice were pre-dosed with 200 μL of 2-mix (B. uniformis and P. merdae) + P. clara (WT) or 2-mix + P. clara (Δ00502). After 4 days, the mice were given peracetic acid-inactivated C. rodentium (10^10 cells/mouse) once a week for 3 weeks. After 3 weeks of immunization, the mice were infected with overnight-cultured C. rodentium (150 μL/mouse) and euthanized on day 14.
Peracetic acid-inactivated C. rodentium was prepared as follows. C. rodentium cultured overnight was collected by centrifugation (16,000 × g, 10 minutes) and resuspended in sterile PBS at a density of 10^10 cells per mL. Peracetic acid (240990, Sigma) was added to a final concentration of 0.4%, and the suspension was incubated for 1 hour at room temperature. After three washes with sterile PBS, the final pellet was resuspended in PBS at a final concentration of 10^11 particles/mL and stored at 4 ℃. Before use, 100 μL of the inactivated vaccine was inoculated into 200 mL of LB medium and incubated overnight at 37 ℃ to confirm complete inactivation.
(mouse hepatitis Virus (MHV) infection)
MHV-2 was supplied by Makoto Ujiie (Nippon Veterinary and Life Science University). Five-week-old male GF C57BL/6N mice were obtained from CLEA Japan or Sankyo Lab Services and housed in separate stainless steel isolators. GF mice were orally administered 200 μL of 2-mix (B. uniformis and P. merdae) + P. clara (WT) or 2-mix + P. clara (Δ00502). Two weeks after inoculation, the mice were orally infected with 4.5 × 10^6 PFU of MHV-2, and survival was monitored daily for 10 days.
To determine viral titers, livers and brains were collected on day 4 or 5 post-infection and homogenized in DNA/RNA Shield (Zymo Research). Viral RNA was extracted using the Quick-RNA Viral Kit (Zymo Research), and cDNA was synthesized using ReverTra Ace (Toyobo) and random primers (Toyobo) according to the manufacturer's instructions.
Quantitative real-time PCR was performed on the LightCycler 480 system (Roche) using THUNDERBIRD SYBR qPCR Mix (Toyobo) to amplify the orf1a gene. The primer sequences used are shown below.
5'-AAGAGTGATTGGCGTCCGTAC-3'(SEQ ID NO:45)
5'-ATGGACACGTCACTGGCAGAG-3'(SEQ ID NO:46)
(degradation of trypsin in vitro)
An overnight culture of bacteria was incubated with recombinant mouse trypsin (final concentration 1 μg/mL) for 1 hour or with human trypsin (final concentration 20 μg/mL) for 4 hours. The recombinant trypsin isoforms used were recombinant mouse PRSS2 (50383-M08H, Sino Biological), recombinant human PRSS1 (LS-G135640), recombinant human PRSS2 (LS-G20167) and recombinant human PRSS3 (His-tag) (NBP2-52220).
In some experiments, recombinant mouse PRSS2 was treated with one of the following trypsin inhibitors for 30 minutes prior to incubation with p.clara cultures: AEBSF (Sigma, final concentration 2 mM), leupeptin (Sigma, final concentration 100. Mu.M), TLCK (Abcam, final concentration 100. Mu.M).
In some experiments, p.clara was incubated overnight in the presence of: tunicamycin (Sigma, final concentration 10. Mu.g/mL), 2-fluoro-L-fucose (Cayman Chemical, final concentration 250. Mu.M), or corresponding DMSO controls.
In experiments evaluating the effect of Ca2+, P. clara was cultured in low-Ca2+ mGAM medium with or without 1 mM Ca2+ prior to incubation with recombinant mouse PRSS2. In experiments using P. clara supernatant, overnight-cultured P. clara was filtered through a sterile filter unit with a PVDF membrane having a pore size of 0.22 μm.
(confocal microscope)
Recombinant mouse PRSS2 was labeled using the Alexa Fluor™ 488 Antibody Labeling Kit (A20181, Thermo Fisher) and pretreated with the inhibitor AEBSF (150 μg/mL rmPRSS2 with 20 mM AEBSF).
The Alexa Fluor™ 488-labeled mouse PRSS2 was incubated with bacteria at a final concentration of 5 μg/mL in an anaerobic chamber for 20 minutes. The mixture was centrifuged, washed once with PBS and resuspended in PBS. Confocal images were captured using a Leica TCS SP8 confocal microscope.
(Crosslinking with disuccinimidyl sulfoxide (DSSO))
DSSO (A33545) was purchased from Thermo Fisher Scientific. The overnight-cultured P. clara (1C4) strain was incubated with AEBSF-treated recombinant mouse PRSS2 for 20 minutes, washed once with PBS and resuspended in 10 mM DSSO. After incubation for 10 minutes at room temperature, the reaction was stopped by adding Tris-HCl buffer (final concentration 20 mM).
After washing with PBS, the pellet was lysed in 1% SDS solution (diluted in 50 mM Tris-HCl buffer containing 5 mM EDTA). The P. clara (1C4) strain alone (not incubated with PRSS2) was treated in the same way and served as a negative control. The supernatant was stained with rabbit anti-6-His antibody (A190-214A, Bethyl Laboratories) and anti-rabbit IgG antibody (HRP) (#7074, Cell Signaling Technology) and analyzed by Western Blot.
(Staining of whole-cell lysate, supernatant and glycoproteins)
The P. clara (1C4) strain was cultured in the presence of tunicamycin (Sigma, final concentration 10 μg/mL), 2-fluoro-L-fucose (Cayman Chemical, final concentration 250 μM) or the corresponding DMSO control. The cultured bacteria were then pelleted, washed once with PBS, lysed in 1% SDS solution (diluted in 50 mM Tris-HCl buffer containing 5 mM EDTA) and subjected to SDS-PAGE using the Novex® NuPAGE® SDS-PAGE Gel System (Thermo Fisher Scientific). Glycoproteins were stained using the Pro-Q™ Emerald 300 Glycoprotein Gel and Blot Stain Kit (Thermo Fisher Scientific) according to the manufacturer's protocol. Proteins from whole-cell lysates were stained with the Colloidal Blue Staining Kit (Thermo Fisher Scientific). The proteins in the supernatant were concentrated using Amicon Ultra Centrifugal Filters (10 kDa NMWL) and stained with the Colloidal Blue Staining Kit (Thermo Fisher Scientific).
(mutant preparation)
Deletion mutants (Δ03049-03053, Δ00502 and Δ00509) of the P. clara (JCM 14859) strain were prepared as follows. First, sequences of about 1 kb flanking the coding regions were amplified by PCR and incorporated into the suicide vector pLGB30 using HiFi DNA Assembly (NEB) according to the manufacturer's protocol. Then, 1 μL of each reaction solution was transformed into competent E. coli S17-1 λpir.
The transformants were then conjugated with the P. clara (JCM 14859) strain as follows. The donor and recipient strains were cultured in LB and EGEF medium, respectively, to an OD600 of 0.5 and mixed at a ratio of 1:1. The mixture was dropped onto an EGEF agar plate and incubated aerobically at 37 ℃ for 16 hours. Transconjugants were then selected on EGEF agar plates containing tetracycline (10 μg/mL). The transconjugants were partially sensitive to rhamnose-induced expression of the ss-bfe1 toxin, and their growth was inhibited in the presence of 10 mM rhamnose (overnight OD600 of about 0.3). Subsequently, to select for loss of the plasmid from the genome through a second crossover, the transconjugants were passaged at least three times in EGEF medium containing 10 mM rhamnose until they were outcompeted by revertants (overnight OD600 reaching about 1.0). The culture was then plated, individual colonies were picked, and successful deletion was confirmed by PCR.
Insertional mutants were prepared by the same procedure: a homologous sequence of about 0.5-1 kb within the coding region was inserted into the suicide vector pLGB30, which was then transformed into the competent E. coli S17-1 strain.
The transformants were conjugated with the P. clara (JCM 14859) strain using the same protocol, and transconjugants were selected on EGEF agar plates containing tetracycline (10 μg/mL), confirmed by PCR, and maintained in EGEF medium containing tetracycline (10 μg/mL). The sequences of all the primers used to prepare the mutants are listed in Tables 1-5 below.
[ Table 1 ]
[ Table 2 ]
[ Table 3 ]
[ Table 4 ]
[ Table 5 ]
(Transmission electron microscope (TEM))
The wild-type (WT) or Δ00502 P. clara (JCM 14859) strain was incubated with recombinant mouse PRSS2 (50383-M08H, Sino Biological, final concentration 5 μg/mL) for 20 minutes, washed with PBS, and fixed with 4% paraformaldehyde-1% glutaraldehyde solution at room temperature for 2 hours. After washing with 0.05 M PBS, the samples were dehydrated stepwise with ethanol (50%, 70%, 80%, 90%, 95%, 100%). The dehydrated pellets were infiltrated with LRW resin (100% ethanol and LRW at 1:1 for 1 hour, then 100% ethanol and LRW at 1:2 overnight, then 100% LRW for 5 hours). After infiltration, the samples were cured in gelatin capsules (53 ℃, 24 hours). The polymerized LRW blocks were sectioned with a Leica Ultracut UCT to obtain 80 nm sections.
For immunostaining, sections were blocked with 0.05 M PBS containing 1% BSA and then stained with rabbit anti-6-His antibody (A190-214A, Bethyl Laboratories) for 60 minutes. After washing with 0.05 M PBS, the sections were stained with 12 nm colloidal gold-labeled goat anti-rabbit IgG antibody for 60 minutes. After washing again with 0.05 M PBS, the sections were fixed with 1% glutaraldehyde in 0.05 M PBS, washed with water and stained with uranyl acetate for 5 minutes. All images were taken with a JEOL JEM-1400 transmission electron microscope.
(expression of recombinant proteins and binding to magnetic microbeads)
To prepare recombinant 00502 and recombinant 00509 proteins, the coding regions of both genes (excluding the N-terminal sequence encoding the signal peptide) were cloned into the expression vector pET-28b(+) (#69865, Novagen), and a His-tag was introduced at the C-terminus according to the supplier's protocol.
The expression vector was then transformed into Rosetta-gami B(DE3) Competent Cells (#71136, Novagen). The transformants were grown to the exponential phase, and protein expression was induced by adding 0.4 mM IPTG (I6758, Sigma). After overnight incubation at 25 ℃, the cells were lysed using B-PER™ Bacterial Protein Extraction Reagent (#78243, Thermo Fisher Scientific), and the recombinant 00502 and 00509 proteins were purified using Pierce™ Ni-NTA Magnetic Agarose Beads (#78605, Thermo Fisher Scientific) and Pierce™ Polyacrylamide Spin Desalting Columns (#89849, Thermo Fisher Scientific).
The purified recombinant 00502 protein, recombinant 00509 protein or bovine serum albumin (#23209, Thermo Fisher Scientific) was then coupled to microbeads (Dynabeads™) using the Dynabeads™ Antibody Coupling Kit (14311D, Thermo Fisher Scientific) according to the manufacturer's protocol. 15 μg of protein was added per mg of magnetic beads.
1 mg of the protein-bound Dynabeads™ was then resuspended in 200 μL of EGEF medium and mixed with recombinant mouse PRSS2 (final concentration 3 μg/mL), AEBSF-pretreated Alexa Fluor™ 488-labeled mouse PRSS2 (final concentration 5 μg/mL) or 50 μL of GF cecal content (diluted 50-fold in PBS). The sequences of all primers used to prepare the recombinants are shown in Table 6 below.
[ Table 6 ]
(protease Activity measurement)
The protease activity of the P. clara culture, the P. clara culture supernatant, the recombinant 00502 protein and the recombinant 00509 protein was determined using the Pierce™ Fluorescent Protease Assay Kit (#23266, Thermo Fisher Scientific) according to the manufacturer's protocol.
Digestion of the FITC-casein substrate into smaller fluorescein-labeled fragments was detected as an increase in total fluorescence using a PerkinElmer 2030 Multilabel Reader with fluorescein excitation and emission filters (485/538 nm). Protease activity was measured as the change in relative fluorescence units (RFU).
(Metagenomic analysis of human fecal samples in public cohorts)
The metagenomes of human stool samples from the PRISM, HMP2, FHS, 500FG, CVON, and Jie cohorts were assembled de novo into non-redundant gene catalogs, and genes were assigned to metagenomic species using MSPminer to quantify relative abundance.
To retrieve, from the gene catalog, homologs of the trypsin-related genes of the P. clara and P. xylaniphila strains (the genes encoding the 00502 and 00509 proteins and six other adjacent genes), USEARCH ublast was used at the protein level, and hits meeting a minimum E-value of 0.1 were retained. The results confirmed that all 8 genes are present in the gene catalog for each species.
To identify other putative homologs and species encoding the locus, we first assessed the similarity between the corresponding homologs of the P. clara and P. xylaniphila strains and set the following minimum identity (Id) and coverage (Cov) thresholds for each gene at the locus. 00502: Id=25%, Cov=90%; 00503: Id=70%, Cov=90%; 00504: Id=60%, Cov=90%; 00505: Id=60%, Cov=90%; 00506: Id=50%, Cov=90%; 00507: Id=25%, Cov=90%; 00508: Id=45%, Cov=80%; 00509: Id=20%, Cov=30%.
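The per-gene thresholds listed above lend themselves to a simple filter over search hits. The sketch below is illustrative only: the data layout (one best hit per gene, given as percent identity and percent coverage) and the function names are assumptions, and the example hits are made up.

```python
# Minimum (identity %, coverage %) per gene at the locus, as listed in the text.
THRESHOLDS = {
    "00502": (25, 90), "00503": (70, 90), "00504": (60, 90), "00505": (60, 90),
    "00506": (50, 90), "00507": (25, 90), "00508": (45, 80), "00509": (20, 30),
}


def is_homolog(gene, identity, coverage):
    """True if a hit meets both the identity and coverage minimum for the gene."""
    min_identity, min_coverage = THRESHOLDS[gene]
    return identity >= min_identity and coverage >= min_coverage


def count_locus_homologs(best_hits):
    """Count how many of the 8 locus genes have a qualifying hit.

    best_hits maps gene id -> (identity %, coverage %) of its best hit.
    """
    return sum(
        1 for gene, (identity, coverage) in best_hits.items()
        if gene in THRESHOLDS and is_homolog(gene, identity, coverage)
    )


# Made-up hits for a hypothetical metagenomic species.
example_hits = {
    "00502": (31.0, 95.0), "00503": (82.0, 99.0), "00504": (55.0, 97.0),
    "00509": (22.0, 35.0),
}
print(count_locus_homologs(example_hits))
```

In the example, the hit for 00504 is rejected because its identity falls below the 60% minimum, leaving three qualifying homologs.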
Then, we assessed which other metagenomic species (MSPs) encoded homologs of the 00502-00509 genes of the P. clara and P. xylaniphila strains, and found MSP 0355 and MSP 0305, which had 8 and 7 homologs, respectively. MSP 0355 and MSP 0305 had previously been annotated only at the phylum level as Bacteroidetes, so in this study their proteomes were compared with the Unified Human Gastrointestinal Genome (UHGG) collection using ublast. The results show that MSP 0355 and MSP 0305 correspond to GUT_GENOME 140082 and GUT_GENOME 016875, respectively, with most of their genes (>90%) mapping with high confidence (median amino acid identity >99%, E-value < 1 × 10^-184) to a single UHGG species representative. In UHGG, both are phylogenetically classified as Paraprevotella species.
(statistics)
All statistical analyses were performed using GraphPad Prism software (GraphPad Software, inc). Multiple comparisons employ one-way analysis of variance and Tukey's test; the comparison between the two groups used either a Welch corrected Mann-Whitney test (non-parametric) or a paired t-test (parametric). The Spearman rank correlation coefficient was used to study the correlation between the two variables. The survival analysis used a log rank (Mantel-Cox) test.
Example 1
(control of trypsin in the large intestine by microbiota)
To investigate the effect of the intestinal microbiota on protein distribution in the large intestine, we collected the cecal content of germ-free (GF) and specific-pathogen-free (SPF) mice and performed proteomic analysis by mass spectrometry. Among the 713 host-derived proteins detected, 324 proteins were more abundant in SPF mice than in GF mice (> 2-fold, p < 0.05). They include immune-related molecules such as alpha-defensin 21 (Defa21) and peptidoglycan recognition protein 1 (Pglyrp1). On the other hand, 45 proteins were more abundant in GF mice than in SPF mice (> 2-fold, p < 0.05).
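The selection criterion used above (> 2-fold difference with p < 0.05) can be expressed as a small classifier. This is a sketch under stated assumptions: the fold change is taken as the SPF/GF abundance ratio, the p-value is supplied from a separate statistical test, and the protein values shown are invented for illustration rather than taken from the dataset.

```python
def classify_protein(fold_spf_over_gf, p_value, fold_cutoff=2.0, alpha=0.05):
    """Classify a protein by its SPF/GF abundance ratio and p-value."""
    if p_value >= alpha:
        return "not significant"
    if fold_spf_over_gf > fold_cutoff:
        return "higher in SPF"
    if fold_spf_over_gf < 1.0 / fold_cutoff:
        return "higher in GF"
    return "not significant"


# Invented (ratio, p-value) pairs for three hypothetical proteins.
examples = {
    "protein_a": (5.0, 0.001),
    "protein_b": (0.1, 0.010),
    "protein_c": (3.0, 0.200),
}
for name, (ratio, p) in examples.items():
    print(name, classify_protein(ratio, p))
```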
FIG. 1 is a graph showing the results of proteomic analysis. FIG. 1 shows proteins with increased expression levels in the cecum of GF mice compared to SPF mice. Among these, we focused on anionic trypsin (encoded by the Prss2 gene). The level of anionic trypsin (PRSS 2) was significantly elevated in the cecal content of GF mice.
FIG. 2 is a graph showing the results of the trypsin activity assay of feces from SPF mice and GF mice. FIG. 3 is a photograph showing the results of Western Blot analysis of anionic trypsin (PRSS2) in feces from SPF mice and GF mice. FIG. 4 is a fluorescence micrograph showing the results of immunostaining for mucus (UEA1) and anionic trypsin (PRSS2) in large intestine sections of SPF mice and GF mice. Nuclei were also stained with DAPI.
The results of the fecal trypsin activity assay, the Western Blot analysis of feces, and the immunostaining of large intestine sections were consistent with those of the proteomic analysis, indicating that the content of anionic trypsin was higher in the large intestine content and feces of GF mice than in those of SPF mice. Anionic trypsin may be referred to below simply as trypsin.
Trypsin is produced in the pancreas as an inactive precursor (trypsinogen) and is then secreted into the duodenum, where it is activated by enteropeptidase. We therefore studied the expression of trypsinogen in the pancreas of GF mice and SPF mice. The results show that the mRNA and protein levels of PRSS2 in the pancreas were comparable between GF mice and SPF mice. These results rule out the possibility that the difference in trypsin levels observed in the large intestine is due to a difference in pancreatic production.
Intraluminal trypsin activity was then assayed at different locations in the small and large intestines of GF mice and SPF mice. FIG. 5 is a graph showing the results of the trypsin activity measurements in the jejunum, ileum, cecum, colon and feces. In FIG. 5, "n.s." means no significant difference, and "*" means a significant difference at P < 0.05.
The results showed similar trypsin levels in SPF mice and GF mice at the distal end of the small intestine. On the other hand, in the large intestine, the trypsin activity of SPF mice was significantly reduced compared to GF mice, suggesting that microbiota play an important role in trypsin regulation in the large intestine.
Example 2
(identification of a Paraprevotella strain capable of promoting trypsin degradation)
Although the large intestine microbiota has been previously reported to inactivate trypsin, the bacteria involved in this process have not been identified. We therefore attempted to isolate and identify trypsin-reducing bacteria from the human microbiota.
First, stool samples collected from 6 healthy Japanese donors (donors A-F) were administered to GF mice. Trypsin activity in the feces of these mice was then measured. FIG. 6 is a graph showing the measurement of trypsin activity in the stool of GF mice administered with stool samples collected from each healthy human donor (A-F). In FIG. 6, "n.s." means no significant difference, and the significance marker means a significant difference at P < 0.001.
The results show that the human fecal samples examined differ in their ability to reduce the trypsin activity of the mouse feces. Specifically, the fecal trypsin activity of mice treated with the microbial flora of donor B was not reduced, whereas the fecal trypsin activity of mice treated with the microbial flora of donors C, D, E and F was significantly reduced.
A mouse that had received the donor C microbiota (mouse C#5) was then selected, and its cecal content was collected and orally administered to new GF mice (GF+C5 mice). To alter the composition of the microbiota, the GF+C5 mice were divided into four groups that received ampicillin (Amp), metronidazole (MNZ), tylosin (Tyl), or no antibiotic (No Abx, control) in their drinking water, and trypsin activity in the feces of each group was measured over time. The results of the study are as follows.
FIG. 7 is a graph showing the results of measurement of trypsin activity in the feces of each group of mice. In FIG. 7, "n.s." means no significant difference, and the significance marker means a significant difference at P < 0.001. Fecal trypsin activity of GF+C5 mice not treated with antibiotics decreased within a few days. Notably, fecal trypsin activity decreased further in the Amp-treated group, whereas it did not decrease in either the MNZ-treated group or the Tyl-treated group. These results indicate that the microbiota of C#5 contains trypsin-reducing bacteria that are enriched by Amp treatment and depleted by MNZ or Tyl treatment.
One of the Amp-treated mice (mouse C5-Amp#5) was selected, and its cecal content was collected and cultured in vitro in various media under anaerobic conditions. A total of 432 colonies were isolated and subjected to 16S rRNA gene sequencing, which grouped them into 35 unique strains. These 35 strains cover the bacterial species that broadly colonized the C5-Amp#5 mouse; when a mixture of the 35 isolated strains (35-mix) was administered to GF mice (GF+35-mix), trypsin activity in the feces was significantly reduced, and this reduction was comparable to that in mice colonized with the original donor C microbiota.
Next, to narrow down the effector bacteria, the 16S rRNA gene sequences of fecal samples obtained in the above antibiotic administration study were analyzed, and a Spearman rank correlation test was performed to evaluate the correlation between the relative abundance of each of the 35 strains and trypsin activity. The results showed that 14 of the 35 strains were inversely correlated with trypsin activity (ρ ≤ -0.3).
The effects of these 14 strains and the remaining 21 strains were then compared in gnotobiotic mice. The results showed that trypsin activity in the feces of GF+14-mix mice was significantly reduced, comparable to GF+35-mix mice, while trypsin activity in the feces of GF+21-mix mice was not reduced. From the 14-mix, 9 strains that were strongly associated with the reduction in trypsin activity (ρ ≤ -0.5, p ≤ 0.05) were then selected.
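The correlation-based narrowing described above can be sketched in Python as follows. This is only an illustration under stated assumptions: Spearman's ρ is computed as the Pearson correlation of average ranks, no p-value is calculated, and the strain names, abundances and activity values are hypothetical. The default cutoff of -0.3 follows the threshold quoted in the text.

```python
from math import sqrt

def average_ranks(values):
    """Return 1-based ranks; tied values receive their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the rank vectors.

    Assumes neither input is constant (otherwise the denominator is zero).
    """
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

def select_strains(abundance, trypsin_activity, rho_cutoff=-0.3):
    """Keep strains whose abundance is inversely correlated with activity."""
    return [name for name, values in abundance.items()
            if spearman_rho(values, trypsin_activity) <= rho_cutoff]

# Hypothetical per-sample values for illustration only.
trypsin_activity = [10, 8, 6, 4, 2]
abundance = {
    "strain_a": [1, 2, 3, 4, 5],   # rises as activity falls
    "strain_b": [5, 4, 3, 2, 1],   # falls together with activity
    "strain_c": [1, 3, 2, 5, 4],   # mostly rises as activity falls
}
selected = select_strains(abundance, trypsin_activity)
```

A stricter second pass, such as the one used to go from 14 to 9 strains, would call `select_strains` with a lower `rho_cutoff` and add a significance test, which this sketch leaves out.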
Colonization of GF mice with the 9-mix significantly decreased trypsin activity in feces, comparable to colonization with the 14-mix. Finally, the 9-mix was divided into a 3-mix of Bacteroidales species and a 6-mix of non-Bacteroidales species.
The 3-mix, consisting of Paraprevotella clara (P. clara, strain number: 1C4), Bacteroides uniformis (B. uniformis, strain number: 3H3) and Parabacteroides merdae (P. merdae, strain number: 1D4), was sufficient to reduce fecal trypsin activity in vivo, whereas the 6-mix of non-Bacteroidales strains was completely ineffective in reducing fecal trypsin activity.
To identify trypsin-degrading bacteria, each strain in 9-mix was incubated with recombinant mouse trypsin (rmPRSS 2, added with C-terminal His tag) and trypsin degradation was measured by Western Blot.
FIG. 8 is a photograph showing the result of Western Blot. The results showed that only P. clara (strain number: 1C4) was able to degrade trypsin. Consistent with this, neither the mixture of B. uniformis 3H3 and P. merdae 1D4 (the 3-mix without P. clara (1C4)) nor the 34-mix (the 35-mix without P. clara (1C4)) was able to reduce fecal trypsin activity in vivo.
Based on the above results, and contrary to the original assumption that a community of microbiota plays the key role in decreasing trypsin activity, we conclude that the single strain P. clara (1C4) is necessary for the decrease in trypsin activity.
Recombinant human trypsin isoforms PRSS1, PRSS2 and PRSS3 (rhPRSS1-3) were then mixed with P. clara (1C4) and incubated, and the degradation of human trypsin was analyzed by Western Blot.
FIG. 9 is a photograph showing the result of Western Blot. In FIG. 9, "-" indicates that P. clara was not added, and "+" indicates that P. clara was added. The results show that, in addition to mouse trypsin, P. clara can promote the reduction of the three known human trypsins (PRSS1 and PRSS2 and, to a lesser extent, PRSS3).
It was then investigated whether the effect of P. clara on trypsin activity was strain-specific. Specifically, recombinant mouse PRSS2 (rmPRSS2) was incubated in vitro with Paraprevotella strains or Prevotella strains, and the degradation of rmPRSS2 was analyzed by Western Blot. The Paraprevotella strains used were P. clara (JCM 14859, P237E3B, P322B5) and P. xylaniphila (JCM 14860, 82A6). The Prevotella strains used were Prevotella copri, Prevotella denticola, Prevotella stercorea and Prevotella oulorum. FIG. 10 is a photograph showing the result of Western Blot. In FIG. 10, "×" indicates the position of the cleaved fragment of PRSS2 detected by the anti-PRSS2 antibody.
Paraprevotella is a recently discovered genus of the family Prevotellaceae comprising only two species, P. clara and Paraprevotella xylaniphila. Therefore, the P. xylaniphila (JCM 14860) strain and the P. xylaniphila (82A6) strain isolated from stool samples of healthy humans were also studied.
The results show that the three other P. clara strains (JCM 14859, P237E3B and P322B5) isolated from fecal samples of healthy people (including non-Japanese donors) share the trypsin-reducing ability. It was also found that both the P. xylaniphila (JCM 14860) strain and the P. xylaniphila (82A6) strain exhibited a strong trypsin-reducing ability similar to that of P. clara.
On the other hand, bacteria of the genus Prevotella (Prevotella copri, Prevotella denticola, Prevotella stercorea and Prevotella oulorum), which are phylogenetically close to the genus Paraprevotella, were unable to reduce trypsin activity.
These results indicate that Paraprevotella is a representative trypsin-degrading component of the human microbiota.
Example 3
(type IX secretion system-dependent autolysis of trypsin mediated by polysaccharide-binding molecules)
The substrate specificity of P. clara was studied. Specifically, cecal content of GF mice, which contains large amounts of trypsin and various other proteins, was incubated in vitro with P. clara (1C4). The abundance of each protein was then studied by LC-MS peptidomic analysis. The results showed that, among 7614 peptides from 276 proteins, trypsin was the only protein whose peptides showed a significant increase in concentration over time in the presence of P. clara.
These results indicate that P. clara has narrow substrate specificity and high activity toward trypsin. In addition, the rate at which P. clara degrades trypsin was studied, and degradation was found to proceed gradually and proportionally.
Furthermore, P. clara-mediated trypsin degradation occurs only in the presence of a sufficient amount of divalent cations (e.g., Ca2+). This degradation therefore appears to be mediated by an enzyme (protease). However, when trypsin was incubated with P. clara culture supernatant, no such degradation was observed. Furthermore, no protease activity was detected when live P. clara or filtered P. clara supernatant was incubated with a protease substrate (fluorescein-labeled casein).
FIG. 11 is a photograph showing the results of incubating protease inhibitor-pretreated recombinant mouse PRSS2 (rmPRSS 2) in combination with P.clara (1C 4) and analyzing rmPRSS2 degradation using Western Blot. The results show that pretreatment with the serine protease inhibitor AEBSF or Leupeptin and the specific trypsin inhibitor TLCK inhibited the trypsin degradation capacity of P.clara. This result suggests that p.clara effects degradation of trypsin by mediating trypsin-dependent autolysis.
Subsequently, to elucidate the mechanism by which P. clara promotes trypsin autolysis, trypsin was labeled with Alexa Fluor 488 to observe the interaction between trypsin and P. clara. FIG. 12 is a photograph showing the results of incubating Alexa Fluor 488-labeled rmPRSS2 with P. clara (1C4), P. denticola and P. oulorum, respectively, and observing the binding of rmPRSS2 to bacterial surfaces with a confocal microscope. In FIG. 12, the black square shows an enlarged image of P. clara (1C4).
The results show that the fluorescently labeled trypsin accumulated on the surface of P. clara within a few minutes. In contrast, no trypsin accumulation was observed on P. denticola or P. oulorum. This suggests that degradation of trypsin occurs at the surface of P. clara and that trypsin-binding surface molecules of P. clara promote the aggregation and autolysis of trypsin.
Next, to identify the trypsin-binding surface molecules of P. clara, trypsin (His-tagged rmPRSS2) and P. clara-derived molecules were crosslinked into complexes using disuccinimidyl sulfoxide (DSSO) as a chemical crosslinking agent. The crosslinked complexes of rmPRSS2 and P. clara-derived molecules were then analyzed by Western Blot.
The results show that DSSO treatment resulted in a new band of high molecular weight (about 250 kDa) recognized by anti-His-tag antibodies. The band indicates the presence of a high molecular weight complex containing trypsin.
Although the mass spectrometer at hand was not sensitive enough to detect the cross-linked peptide fragment of the complex, the band around 250kDa was smeared, indicating that trypsin is interacting with a heterogeneous molecule.
It is well known that the cell surface of Bacteroidales (including Paraprevotella) carries modified glycan complexes. This suggests that sugar-containing molecules on the P. clara surface are involved in trypsin binding and degradation.
To investigate this hypothesis, experiments were performed using inhibitors against p.clara saccharide synthesis. First, p.clara is treated with tunicamycin, an inhibitor against the WecA class of transferases, which mediate the first step of bacterial Lipopolysaccharide (LPS) O-glycan formation.
FIG. 13 is a photograph showing the results of incubating rmPRSS2 with P. clara (1C4) pretreated with tunicamycin (Tuni.) or a control and analyzing rmPRSS2 degradation by Western Blot. The results showed that trypsin was not degraded by P. clara treated with tunicamycin.
FIG. 14 is a photograph showing the results of incubating P. clara (1C4), treated or not treated with tunicamycin, with Alexa Fluor 488-labeled rmPRSS2 and observing the binding of rmPRSS2 to bacterial surfaces with a confocal microscope. In FIG. 14, "Tunicamycin (+)" represents the results with tunicamycin treatment and "Tunicamycin (-)" represents the results without tunicamycin treatment. The results showed that trypsin did not aggregate on the surface of tunicamycin-treated P. clara. Likewise, when P. clara was treated with the glycosyltransferase inhibitor 2-fluoro-L-fucose (2F-Fuc), which broadly inhibits glycan synthesis, both the binding of trypsin to the P. clara surface and the degradation of trypsin were inhibited.
T9SS (type IX secretion system) is a bacterial machinery that works together with the Sec system to transport proteins bearing a conserved C-terminal domain across the outer membrane to the cell surface. T9SS has a transpeptidase-like protease activity that removes the C-terminal domain and attaches the exported protein to surface polysaccharides.
The inventors speculated that cell surface proteins secreted by T9SS may be responsible for the recruitment and degradation of trypsin, and conducted the following study. First, gene sequences predicted to belong to T9SS were identified in the genomes of P. clara and P. xylaniphila.
FIG. 15 is a schematic diagram showing the arrangement of the T9SS gene composition in the P.clara (JCM 14859), P.xylanipila (JCM 14860) and P.gingivalis (ATCC 33277) genomes.
A mutant P. clara (JCM 14859) strain lacking expression of the essential T9SS factor PorU was then prepared by homologous recombination with a plasmid sequence. FIG. 16 is a schematic diagram showing the homologous recombination used to abolish PorU expression.
FIG. 17 is a photograph showing the result of incubating the mutant P. clara (JCM 14859) with rmPRSS2 and analyzing the degradation of rmPRSS2 by Western Blot. In FIG. 17, "PorU variant" represents the result for the mutant P. clara (JCM 14859), and "WT" represents the result for the wild-type P. clara (JCM 14859). The results showed that the mutant P. clara (JCM 14859) completely lacked trypsin-degrading activity, indicating that T9SS-dependent surface proteins are involved in trypsin degradation.
The P. clara culture supernatants were then subjected to proteomic analysis in the presence or absence of tunicamycin to identify cell surface proteins involved in trypsin degradation. As a result, 20 bacteria-derived proteins were found in the supernatant of P. clara cultures treated with tunicamycin.
Thus, by introducing a plasmid sequence into each of the 20 target loci, or by removing the gene cluster (Δ03048-03053), a series of mutant p.clara strains inhibiting synthesis of the tunicamycin-sensitive protein were prepared.
FIG. 18 is a photograph showing the results of Western Blot analysis of rmPRSS2 degradation mediated by wild-type (WT) P. clara (JCM 14859) or a series of P. clara (JCM 14859) mutants. In FIG. 18, "Δ03048-03053" represents the result for the mutant P. clara from which the gene cluster was removed, and "00029", "00890", "00822", "00342", "00104", "03191", "00502", "02199", "00472", "01041", "00823", "00729", "00935", "01686", "03166", "00509", "00002", "PorU" and "WecA" represent the results for the mutant P. clara strains in which the respective gene was disrupted by plasmid insertion.
The results showed that when the gene encoding 00502 protein (UniProtKB ID: G5SNC9, omp 28-associated extracellular protein) or 00509 protein (UniProtKB ID: G5SNC1, unknown protein) was disrupted, trypsin degradation in vitro disappeared, as was the case for mutants lacking PorU or WecA (target factor for tunicamycin).
Next, the same study was performed by preparing p.clara strain (Δ00502 and Δ00509) lacking 00502 protein or 00509 protein instead of the insertion mutant.
FIG. 19 is a photograph showing the results of incubating Δ00502 and Δ00509 with Alexa Fluor 488-labeled rmPRSS2, respectively, and observing the binding of rmPRSS2 to the bacterial surface with a confocal microscope. The results showed that there was no recruitment of trypsin in both the Δ00502 strain and the Δ00509 strain.
FIG. 20 is a photograph showing the results of incubating the strain Δ00502 and the strain Δ00509 with rmPRSS2, respectively, and analyzing the degradation of rmPRSS2 by Western Blot. In fig. 20, "WT" represents the result of wild-type p.clara (JCM 14859). The results showed no degradation of trypsin in both the Δ00502 strain and the Δ00509 strain.
The genomic sequences of the Paraprevotella strains were then analyzed. FIG. 21 is a schematic diagram showing the genomic sequences of the Paraprevotella strains. The results showed that all strains with trypsin degradation ability possess the 00502 gene and the 00509 gene. In contrast, none of the Prevotella species investigated in the study had homologs of the 00502 gene or the 00509 gene.
In addition, the 00503 to 00508 genes, whose loci lie between those of the 00502 gene and the 00509 gene, are conserved among the Paraprevotella strains. FIG. 22 is a photograph showing the result of incubating a P. clara mutant strain lacking the 00503 to 00508 genes with rmPRSS2 and analyzing the degradation of rmPRSS2 by Western Blot. The results showed that the P. clara mutant lacking the 00503 to 00508 genes retained trypsin degradation activity, ruling out the involvement of these genes in trypsin degradation.
Example 4
(study of 00502 protein and 00509 protein)
Recombinant 00502 protein and 00509 protein were prepared. The expression vector of 00502 protein or 00509 protein was introduced into E.coli, and treated with isopropyl-beta-thiogalactoside (IPTG) to induce recombinant protein expression, and then the expressed recombinant 00502 or 00509 protein was purified from the cell lysate using magnetic agarose beads.
FIG. 23 is a graph showing the results of measuring the protease activity of recombinant 00502 protein or 00509 protein by cleavage of FITC-labeled casein. Trypsin (1 ng/μL) was used as a positive control. In FIG. 23, (-) indicates that no protein was added. Protease activity is expressed as a change in relative fluorescence units. The results showed that neither the recombinant 00502 protein nor the recombinant 00509 protein had protease activity.
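Expressing protease activity as a change in relative fluorescence units can be illustrated with a short Python sketch. It assumes a series of fluorescence readings per well and a no-protein blank that is subtracted as background; the function names and all readings below are hypothetical and are not assay data.

```python
def delta_rfu(readings):
    """Change in relative fluorescence units from first to last reading."""
    return readings[-1] - readings[0]

def background_corrected(sample, blank):
    """Sample delta-RFU minus the delta-RFU of the no-protein blank."""
    return delta_rfu(sample) - delta_rfu(blank)

# Hypothetical readings for illustration only.
blank = [100, 102, 105]            # (-): no protein added
trypsin = [100, 900, 1700]         # positive control
protein_00502 = [100, 103, 104]    # recombinant protein without activity

trypsin_signal = background_corrected(trypsin, blank)
protein_signal = background_corrected(protein_00502, blank)
```

A signal close to zero after background subtraction, as for the hypothetical `protein_00502` well, corresponds to the absence of detectable protease activity.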
FIG. 24 is a photograph showing the result of incubating free recombinant 00502 protein and 00509 protein, or microbead-bound recombinant 00502 protein and 00509 protein, with rmPRSS2 and analyzing rmPRSS2 degradation by Western Blot.
FIG. 25 is a photograph showing the results of incubating microbead-bound recombinant 00502 protein, recombinant 00509 protein, and Bovine Serum Albumin (BSA) with rmPRSS2, respectively, and observing the binding of rmPRSS2 to protein-bound microbeads with a confocal microscope. The scale bar is 5 μm. BSA was used as a negative control.
The results showed that neither the free 00502 protein nor 00509 protein showed trypsin degradation activity. In contrast, recombinant 00502 protein bound to microbeads showed efficient recruitment and degradation of trypsin.
FIG. 26 is a photograph showing the results of Western Blot analysis of in vitro trypsin degradation after GF mouse cecal content was incubated for 24 or 48 hours with a medium control (-) or with microbead-bound recombinant 00502 protein. In FIG. 26, "×" indicates the position of the cleaved fragment of PRSS2 detected by the anti-PRSS2 antibody. Significant trypsin degradation was detected with the bead-bound recombinant 00502 protein.
These results support the model proposed by the inventors, in which 00502 protein acts as a scaffold for trypsin binding and promotes trypsin autolysis. In addition, trypsin bound efficiently to microbead-bound recombinant 00509 protein, but no degradation of trypsin occurred.
These results indicate that 00502 protein is the main component responsible for trypsin recruitment and autolysis, while 00509 protein plays an auxiliary role in trypsin recruitment.
FIG. 27 is a diagram showing the model of trypsin degradation mediated by Paraprevotella. In FIG. 27, "Sec" represents the Sec system, through which proteins are exported across the cytoplasmic membrane.
As shown in FIG. 27, 00502 and 00509 proteins are transported across the outer membrane of Paraprevotella cells by the type IX secretion system (T9SS). PorU is an essential component of T9SS; it cleaves the C-terminal domain (CTD) of T9SS-dependent proteins and links these proteins to the LPS molecules of Paraprevotella. WecA mediates an initial step in LPS O-glycan synthesis, so disrupting the function of WecA (e.g., by tunicamycin treatment) results in the release of T9SS-dependent proteins. 00502 protein is involved in trypsin recruitment and autolysis as the major effector, whereas 00509 protein aids in trypsin recruitment.
The exact mechanism by which the 00502 protein, but not 00509 protein, promotes autolysis of trypsin is still unclear, but it is speculated that binding of trypsin to 00502 protein may alter the structure of trypsin, making it more susceptible to autolysis.
Example 5
(colonization by P. clara helps to maintain IgA)
To confirm the contribution of 00502 and 00509 proteins to trypsin degradation in vivo, GF mice colonized with one of three p.clara (JCM 14859) strains (wild-type (WT), Δ00502 or Δ00509) were analyzed.
Since P. clara alone was unable to colonize mice, each P. clara strain was inoculated together with two strains that do not degrade trypsin (2-mix: Bacteroides uniformis 3H3 and Parabacteroides merdae 1D4). When inoculated with the 2-mix, all three P. clara strains colonized the intestinal tract of mice effectively.
FIG. 28 is a graph showing the results of DNA quantification of P.clara strain in feces of GF mice inoculated with P.clara strain (wild type (WT), Δ00502 or Δ00509) and 2-mix. In fig. 28, "n.s." means no significant difference. The results showed that in vivo, the deletion of 00502 gene or 00509 gene did not affect the number of p.clara strains.
FIG. 29 is a graph showing measurement results of trypsin activity in feces of GF mice inoculated with the P. clara strain (wild type (WT), Δ00502 or Δ00509) and 2-mix. In FIG. 29, the two significance markers indicate significant differences at P < 0.05 and P < 0.001, respectively. Consistent with the in vitro results, mice colonized with the Δ00502 P. clara strain maintained high trypsin activity in the feces, whereas mice colonized with the Δ00509 P. clara strain showed a partial decrease in trypsin activity.
Subsequently, the importance of 00502 was investigated in the presence of a more complex microbiota. FIG. 30 is a graph showing the results of measurement of trypsin activity in feces of GF mice inoculated with the P. clara strain (wild type (WT) or Δ00502) and the 34-mix described above (the 35-mix without P. clara (1C4)). In FIG. 30, "×" indicates a significant difference at P < 0.01. The results showed that mice colonized with the Δ00502 P. clara strain had higher trypsin activity in feces than mice colonized with the wild-type (WT) P. clara strain.
These results demonstrate that 00502 protein plays an essential role in promoting trypsin degradation in vivo.
The effect of the regulation of intestinal trypsin levels by wild-type (WT) and mutant P. clara on important intestinal defense factors, such as IgA and antibacterial peptides, was subsequently studied.
FIG. 31 is a photograph showing the results of Western Blot analysis of trypsin (PRSS2), type II transmembrane serine protease (TMPRSS2), IgA heavy chain, kappa light chain, the anti-Gram-negative antibacterial peptide Reg3β and chymotrypsin-like elastase 3B (CELA3B) in the feces of GF mice inoculated with the P. clara strain (wild type (WT), Δ00502 or Δ00509) and 2-mix.
According to the results, more IgA heavy chain (alpha chain) was detected in the feces of mice colonized with the wild-type P. clara strain than in mice colonized with the Δ00502 or Δ00509 P. clara strain. In contrast, kappa light chain is resistant to trypsin, so kappa light chain levels were similar in the feces of mice colonized with the different P. clara strains. The anti-Gram-negative antibacterial peptide Reg3β, like kappa light chain, is also resistant to trypsin.
These results indicate that colonization of the p.clara strain protects IgA from trypsin degradation in vivo.
FIG. 32 is a photograph showing the results obtained by incubating feces of GF mice, feces of GF mice inoculated with the wild-type P. clara strain and 2-mix, a mixed sample of the two, and a mixed sample of the two plus trypsin inhibitor (TLCK) at 37°C for 48 hours, and then analyzing trypsin (PRSS2), IgA heavy chain, kappa light chain, and chymotrypsin-like elastase 3B (CELA3B) by Western Blot.
The results in lane 1 show that trypsin was present in the feces of GF mice and that the IgA heavy chain was degraded. The results in lane 2 show that trypsin was reduced in the feces of GF mice inoculated with the wild-type P. clara strain and 2-mix, and that the IgA heavy chain was still present. The results in lane 3 show that residual trypsin in GF mouse feces degrades the IgA heavy chain present in the feces of GF mice inoculated with the wild-type P. clara strain and 2-mix. The results in lane 4 show that adding the trypsin inhibitor inhibits the IgA heavy chain degradation seen in lane 3. The results in lanes 3 and 4 also indicate that IgA heavy chains are more susceptible to trypsin degradation than kappa light chains.
Example 6
(study of the effect of colonization of the P.clara Strain on infection with an enteric pathogen)
It was investigated whether trypsin degradation mediated by the p.clara strain can maintain IgA levels under conditions of enteric pathogen infection.
GF mice were inoculated with 2-mix alone or with 2-mix in combination with the wild-type (WT) or Δ00502 P. clara strain and, 14 days after inoculation, were infected with the mouse pathogen Citrobacter rodentium (C. rodentium), which mainly infects the large intestine.
FIG. 33 shows the change in body weight of each group of mice after C. rodentium infection. In FIG. 33, the two significance markers indicate significant differences at p < 0.01 and p < 0.001, respectively (2-mix+Δ00502 vs. 2-mix), and "#" indicates a significant difference at P < 0.05 (2-mix+WT vs. 2-mix).
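Body weight change after infection is commonly reported relative to the weight on the day of infection. The following Python sketch shows that normalization under the assumption that the first value in each series is the day-0 weight; the function name and the weights are hypothetical and do not come from FIG. 33.

```python
def percent_of_initial(weights):
    """Express each body weight as a percentage of the first (day-0) weight."""
    initial = weights[0]
    return [100.0 * w / initial for w in weights]

# Hypothetical weights in grams for one mouse, for illustration only.
weights_g = [20.0, 19.0, 18.0]
relative = percent_of_initial(weights_g)
```

Normalizing each animal to its own starting weight lets groups with different baseline weights be compared on one axis.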
The upper half of FIG. 34 is an image showing the large intestine of each group of mice 7 days after infection with C. rodentium. In the 2-mix group, the cecum was whitened and shrunken. The bottom half of FIG. 34 is a representative micrograph showing hematoxylin-eosin staining of cecal tissue.
FIG. 35 is a graph showing histological scores of cecal tissue based on hematoxylin-eosin staining. In FIG. 35, "n.s." means no significant difference, and the significance marker means a significant difference at P < 0.001.
The results show that the 2-mix group showed rapid weight loss and severe cecal inflammation after infection with C. rodentium. In contrast, mice colonized with P. clara showed milder weight loss and milder cecal inflammation. Surprisingly, this was similar in the 2-mix+WT and 2-mix+Δ00502 groups. These results indicate that the P. clara strain protects mice from C. rodentium infection by a mechanism independent of trypsin degradation.
FIG. 36 is a photograph showing the results of analyzing total IgA, C. rodentium-specific IgA and chymotrypsin-like elastase 3B (CELA3B) in feces by Western Blot.
The results show that, despite similar levels of cecal inflammation, mice in the 2-mix+WT group maintained higher total IgA levels than mice in the 2-mix+Δ00502 group and had considerable C. rodentium-specific IgA as early as 7 days after C. rodentium infection.
FIG. 37 is a photograph showing the results of evaluating the aggregation effect of fecal IgA by in vitro incubation of live C. rodentium with fecal fluid. The results show that incubating the fecal fluid of the 2-mix+WT group with live C. rodentium resulted in aggregation. This result indicates the presence of C. rodentium-specific IgA.
In summary, the P. clara strain, in addition to having a protective effect against C. rodentium infection through an unknown mechanism, enhances the adaptive immune response to pathogens by protecting pathogen-specific IgA.
Example 7
(enhancing the efficacy of a vaccine against C. rodentium infection through trypsin degradation by the P. clara strain)
The p.clara strain mediated trypsin degradation and the consequent pathogen-specific IgA protection may enhance the efficacy of oral vaccines against enteric pathogens and provide greater resistance when re-exposed to the same pathogen.
To test this hypothesis, we conducted the following study. FIG. 38 is a schematic diagram showing the schedule of this example. GF mice colonized with the WT or Δ00502 P. clara strain together with 2-mix were orally vaccinated with an acetic acid-inactivated C. rodentium vaccine once a week for 3 weeks. The mice were then orally infected with C. rodentium.
FIG. 39 is a graph showing the change in body weight of mice after C. rodentium infection. In FIG. 39, "×" indicates a significant difference at P < 0.05. The results showed that mice pre-inoculated with the wild-type P. clara strain lost less weight.
The cecal patches and lumen contents were then collected from mice 14 days after C. rodentium infection, and the CFU of C. rodentium was measured. FIG. 40 and FIG. 41 are graphs showing the CFU measurements of C. rodentium.
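The text does not state how the CFU values were derived from plate counts. As a hedged illustration, the Python sketch below uses the standard plate-count relation (colonies multiplied by the dilution factor, divided by the plated volume, then scaled to the sample mass). The function name, the parameters and the example numbers are assumptions, not the protocol used in this example.

```python
def cfu_per_gram(colonies, dilution_factor, plated_volume_ml,
                 homogenate_volume_ml, sample_mass_g):
    """Estimate CFU per gram of sample from a single plate count.

    dilution_factor: e.g. 1000 for a 10^-3 dilution of the homogenate.
    """
    cfu_per_ml = colonies * dilution_factor / plated_volume_ml
    return cfu_per_ml * homogenate_volume_ml / sample_mass_g

# Hypothetical plate count for illustration only.
estimate = cfu_per_gram(
    colonies=50,
    dilution_factor=1000,
    plated_volume_ml=0.1,
    homogenate_volume_ml=1.0,
    sample_mass_g=0.05,
)
```

With these hypothetical inputs the estimate is about 1e7 CFU per gram; in practice only plates within a countable colony range would be used.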
The results show that although the amount of C. rodentium in the cecal lumen contents was similar in the 2-mix+WT and 2-mix+Δ00502 groups, fewer C. rodentium were present in the cecal tissue of the 2-mix+WT group. In the 2-mix+WT group, infiltration of cecal tissue by C. rodentium was inhibited.
FIG. 42 is a photograph showing the results of analysis of trypsin (PRSS2), total IgA, C. rodentium-specific IgA and chymotrypsin-like elastase 3B (CELA3B) in the cecal lumen contents by Western Blot.
The results show that significantly higher total IgA levels and significantly higher C. rodentium-specific IgA levels were detected in the cecum of mice colonized with the wild-type (WT) P. clara strain. From these results, it can be inferred that the vaccine was highly effective in preventing invasion by C. rodentium in the 2-mix+WT group.
These results support the inventors' view that administration of the P. clara strain allows the host to cope more effectively with previously encountered enteropathogens.
Example 8
(P.clara strain protects mice from coronavirus infection)
Trypsin and trypsin-like proteases such as type II transmembrane serine protease (TMPRSS2) are known to be involved in the proteolytic activation of coronavirus spike proteins and in the fusion of viral and host cell membranes.
TMPRSS2 is expressed as a transmembrane protein in lung and intestinal epithelial cells, but it can undergo self-cleavage to release its protease domain. Interestingly, the inventors found in Example 5 above that the content of TMPRSS2 was reduced in the feces of GF mice colonized with the wild-type P. clara strain. This suggests that the P. clara strain also acts on the released active form of TMPRSS2 in vivo. In the gut, free TMPRSS2 may therefore promote coronavirus infection together with trypsin.
The possibility that the P. clara strain protects the gut from coronavirus infection by degrading trypsin and TMPRSS2 was investigated. Specifically, the effect of colonization by the P. clara strain on infection with mouse hepatitis virus (MHV), a mouse-tropic coronavirus, was studied.
Fig. 43 is a schematic diagram showing an outline of an MHV infection experiment. GF mice were inoculated with WT or Δ00502 p.clara (JCM 14859) strain, simultaneously with 2-mix, and after 14 days were subjected to oral infection with MHV.
FIG. 44 is a graph showing survival curves of mice after MHV infection. The results show that colonization of the wild-type p.clara strain prolonged survival of mice after MHV lethal infection.
FIG. 45 is a graph showing the results of measuring MHV titers in the liver, brain, and feces. In FIG. 45, "*", "**", and "***" indicate significant differences at p < 0.05, p < 0.01, and p < 0.01, respectively.
The results showed that colonization with the wild-type P. clara strain successfully prevented viral spread to the liver and brain. In addition, the number of viral particles excreted in the feces of mice in the 2-mix+WT group was significantly lower than in mice in the 2-mix+Δ00502 group.
FIG. 46 shows representative images of hematoxylin-eosin staining of liver tissue from each group of mice. Histological analysis showed that mice colonized with the wild-type P. clara strain were protected from the necrotic liver lesions caused by MHV.
These results indicate that colonization with the P. clara strain protects the host during coronavirus infection.
Example 9
(Detection of Paraprevotella and related genes in the human microbiota)
First, about 6 million non-redundant complete genes were collected from cohorts in six regions to establish a new gut bacterial gene catalog, and the abundance and prevalence of Paraprevotella and of the trypsin degradation-related 00502 and 00509 genes were analyzed.
Homology searches were performed against the 5,929,528 genes in the non-redundant gut bacterial gene catalog using USEARCH ublast (at the protein level), and the results were clustered based on matches at an e-value threshold of 0.1.
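As an illustration of the filtering step described above, the following Python sketch keeps hits at or below an e-value threshold of 0.1 and groups them by query gene. It assumes that the search results were written in the 12-column, tab-separated BLAST6 layout (as produced by the `-blast6out` option of USEARCH). The function names, gene identifiers, and example hits are hypothetical, and the grouping shown here is only one possible reading of the clustering step, not the inventors' actual pipeline.

```python
from collections import defaultdict


def parse_blast6(lines):
    """Yield one dict per hit from 12-column tab-separated BLAST6 lines."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split("\t")
        if len(parts) != 12:
            raise ValueError(f"expected 12 columns, got {len(parts)}")
        yield {
            "query": parts[0],
            "target": parts[1],
            "pct_id": float(parts[2]),
            "evalue": float(parts[10]),
            "bitscore": float(parts[11]),
        }


def group_hits_by_query(lines, max_evalue=0.1):
    """Keep hits with e-value <= max_evalue and group them by query gene.

    Hits within each group are sorted from most to least significant.
    """
    groups = defaultdict(list)
    for hit in parse_blast6(lines):
        if hit["evalue"] <= max_evalue:
            groups[hit["query"]].append(hit)
    for hits in groups.values():
        hits.sort(key=lambda h: h["evalue"])
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical hits, for illustration only (not data from the source).
    example = [
        "00502\tgene_A\t99.8\t1263\t3\t0\t1\t1263\t1\t1263\t0.0\t2500",
        "00502\tgene_B\t31.0\t400\t250\t10\t5\t404\t9\t410\t1e-20\t95.1",
        "00509\tgene_C\t28.0\t60\t40\t2\t1\t60\t3\t62\t2.5\t20.3",
    ]
    for query, hits in group_hits_by_query(example).items():
        print(query, [h["target"] for h in hits])
```

Any hit above the threshold (such as the hypothetical `gene_C` hit) is discarded before grouping, so a query with no significant hits simply does not appear in the result.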
FIG. 47 is a diagram showing the homologs of the trypsin degradation-related genes (the 00502 and 00509 genes) identified in silico and the species that encode them.
FIG. 48 is a diagram showing the locus structure of the genes involved in trypsin degradation in each bacterial species, together with their sequence identity (%) to the corresponding genes of the P. clara strain.
The results showed that the 00502 gene of the P. clara strain and its homolog in the P. xylaniphila strain were identical or nearly identical. Similarly, the 00509 gene of the P. clara strain and its homolog in the P. xylaniphila strain were confirmed to be identical or nearly identical. In addition, the homologs of the 00503 to 00508 genes, which are located between the 00502 and 00509 genes, were confirmed to be identical or nearly identical between the P. clara strain and the P. xylaniphila strain.
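The sequence identity (%) values referred to above can be understood through a minimal sketch such as the following, which computes percent identity between two sequences that have already been aligned to equal length. The function name and the choice of denominator (all alignment columns except those that are gaps in both sequences) are assumptions made for illustration; the source does not state which definition or alignment tool was used.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where both sequences have a gap are ignored. A column counts
    as a match only if both residues are identical and neither is a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == gap and res_b == gap:
            continue  # uninformative column
        compared += 1
        if res_a == res_b and res_a != gap:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable alignment columns")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Short hypothetical fragments, for illustration only.
    print(percent_identity("MKKKSCLVFL", "MKKKSCLVFL"))
    print(percent_identity("MKKA", "MKKV"))
```

Under this definition, a homolog would meet a threshold such as "30% or more sequence identity" when the returned value is at least 30.0. Other definitions (for example, dividing by the length of the shorter sequence) give different values for the same alignment.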
In addition, two metagenomic species (MSP 0303 and MSP 0335) carrying homologs of the 00502 to 00509 genes were identified and were considered likely members of Paraprevotella.
The two metagenomic species (MSP 0303 and MSP 0335) annotated as Paraprevotella encode homologs of all or nearly all of the 00502 to 00509 genes of the P. clara strain, whereas five MSPs (MSP 0081, MSP 0224, MSP 0288, MSP 0410, and MSP 0435) annotated as Bacteroides encode homologs of the 00502 and 00509 genes but lack homologs of the 00503 to 00508 genes.
Prevotella muris is also believed to lack a homolog of the 00509 gene.
In the metagenomic analysis, the mean relative abundance of Paraprevotella was up to 3%, and its prevalence varied widely between cohorts (7-50% of samples); P. clara was the most common species, followed by P. xylaniphila.
These data indicate that Paraprevotella forms an important part of the human microbiota and may be related to individual differences in susceptibility to infection by enteric pathogens.
As in mice, trypsin concentrations in humans are thought to be regulated by trypsin-degrading bacterial species such as Paraprevotella. Patients with inflammatory bowel disease (IBD), such as ulcerative colitis (UC) and Crohn's disease (CD), have been reported to have high trypsin concentrations in their feces. We therefore measured trypsin activity in the feces of a non-IBD control group (healthy subjects), ulcerative colitis (UC) patients, and Crohn's disease (CD) patients in a Japanese cohort.
FIG. 49 is a graph showing the results of the trypsin activity measurements. In FIG. 49, "*" indicates a significant difference at p < 0.05. The results showed that trypsin activity was higher in the feces of UC and CD patients than in the non-IBD control group.
Subsequently, we analyzed Paraprevotella carriage rates in the non-IBD control, UC, and CD subgroups of two IBD cohorts (PRISM and HMP2). FIG. 50 is a graph showing Paraprevotella carriage rates in patients diagnosed with ulcerative colitis (UC) or Crohn's disease (CD) in the PRISM and HMP2 cohorts. The results show that, in both studies, Paraprevotella was present in a greater proportion of non-IBD samples than of samples from UC and CD patients. This trend was also statistically significant in the larger-scale HMP2 cohort.
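A comparison of carriage rates between two groups of this kind can be sketched as follows. The source does not state which statistical test was used; Fisher's exact test on a 2x2 table (carriers versus non-carriers, by group) is assumed here purely as one common choice. The function names are hypothetical, and the counts in the usage example are invented for illustration and are not data from the PRISM or HMP2 cohorts.

```python
from math import comb


def carriage_rate(carriers, total):
    """Fraction of samples in which the genus was detected."""
    if total <= 0 or not 0 <= carriers <= total:
        raise ValueError("need 0 <= carriers <= total and total > 0")
    return carriers / total


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are groups (e.g. non-IBD, IBD); columns are carriers and
    non-carriers. The p-value is the total probability of all tables
    with the same margins that are no more likely than the observed one.
    """
    if min(a, b, c, d) < 0:
        raise ValueError("counts must be non-negative")
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x carriers in the first group.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


if __name__ == "__main__":
    # Hypothetical counts: (carriers, non-carriers) per group.
    control = (12, 18)
    patients = (5, 45)
    print(carriage_rate(control[0], sum(control)))
    print(carriage_rate(patients[0], sum(patients)))
    print(fisher_exact_two_sided(*control, *patients))
```

An exact test is a reasonable default when some cell counts are small, as can happen for a genus that is carried by only a minority of samples.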
These results indicate that Paraprevotella carriage is associated with intestinal health.
FIG. 51 is a photograph showing the results obtained by incubating a Prevotella rodentium strain, a Prevotella muris strain, the wild-type P. clara strain (P. clara (WT)), and, as a negative control, a P. clara strain in which the 00502 gene was knocked out (P. clara (00502 KO)), each mixed with recombinant mouse trypsin (rmPRSS2, with a C-terminal His tag), and measuring trypsin degradation by Western blot.
FIG. 52 is a photograph showing the results obtained by incubating a Prevotella ra (MSP 0081) strain, the wild-type P. clara strain (P. clara (WT)) as a positive control, and the P. clara (00502 KO) strain as a negative control, each mixed with recombinant mouse trypsin (rmPRSS2, with a C-terminal His tag), and measuring trypsin degradation by Western blot.
[Industrial applicability]
The present invention provides a technique for controlling protease activity.
SEQUENCE LISTING
<110> RIKEN (National Research and Development Agency)
Keio University
<120> composition for degrading trypsin or TMPRSS2
<130> PC-33307
<160> 178
<170> PatentIn version 3.5
<210> 1
<211> 3792
<212> DNA
<213> Paraprevotella clara
<400> 1
atgaaaaaaa aatcttgttt ggtgttcctc tttctattgg gaatactatg gaatgtacaa 60
gtccacgctg acggcgtgag ccaacctacg ttccatacct tcaagttcac agaccaagct 120
acactccaaa acatgtccga caatgggaaa tgggctgtag ctttcggtac gaacggtaac 180
agtgtggaag attttccgaa gctcatagac ttggctacgg acaaggctac cgaacttttg 240
agcgaaagcg gaaccgcatt aggcaacggc gcctacgatg tcaataacga gggaactctc 300
gtagtaggac gatatgaagg caatccggcc gtttggacca aaaccaataa caaatggaca 360
gtacttcctg ttcctacggg atgggacggt ggtttagtta acgccattac tcccgacgga 420
aagtgggcca taggtagggc caccaaaggt cagtatgacg aaactcccgt actgtgggat 480
atgagtaaag gcggtataat cacagaaacg cctaatatcc cggtaaaaga catgaccgga 540
ctagaccagc accaaagccg ctttgtaggc atttccgctg acggacgtta tatcgtaggt 600
tgcctgtctt tcagctatat ccagcctatc gcttgttgtt attacgtcta tgatcgggat 660
aagcaaacat ataaattcat cggattcgat gtagatgaga attacaaatg gactccgttg 720
gctcagaatc tggcctttat cgatgatgca cgtatcagca acaacggaaa gtatataacg 780
ggtactgctt atatggtaaa agaaggaaat gccgaataca aaactccttt tctttatgac 840
gtcgaagccg gttctttctc tatttacgac cagaatcagg atcaaggtct catcggacta 900
tgttgtgata acgagggtca tgtgttcggt gcttcacctt cagacagtcc ggcacgtgaa 960
tggagcatac gcgtaggcca atactggtac agcctaaccc aaatcctgaa acagcattac 1020
aatatagact tcaatacagc ttcggggttt gaaaatagcg gcactccgat tgcccttaat 1080
gtagaaggga ctaaaatcgc cgttttcgtc tataaaggtg aaagttatat actggagatg 1140
cctcaatcgc tcgttgaact gaccaatgaa attgatttat tggccaacta ttccgtcagc 1200
ccggaggaag gttcacaatt ctctgccctg aaatcactaa gcctcacttt tgaccgcgac 1260
attgaagtga caggaaagag cagtgccgta atcttaaaag acgaaaacgg gaacaaagta 1320
acaagttcag taacgttcaa acgtaacgcc aataacagca aaatcgtcga tattagtttc 1380
cgtggcgcaa acctcgaaga cggtaaaaaa tatactgttt ctgtcccggc tggttccatc 1440
gttattaatg gtgatgccga gaaggcctgc aaggaaatta acattagcta caccggacgg 1500
gccaataagc cagtcgcacc tactagcatt gcaccggttg acaactcaac cctatcattg 1560
ttgaacttct ctacgaaccc tatcatcatt aattttgatg caggaatcgc cttgaccgac 1620
acggcttacg ccgcagttta ccgcaataac gaaacagagc cgatgtgcga actgaaaatg 1680
gctgtgtcct ctctcgacag taaacaatta ggcgtgtacc cgtctgccgg acaatatttg 1740
tataagggta atgagtatta cgtaaaaatc aaggcaggat cagtcacgga cgttttaggt 1800
agcaatccga acgaagccat caacctgcac tacacgggta actatgaacg tgaaatcaac 1860
tacgacgatg taacattgtt caaagacgat tttaacaacg gtttgggcca attcctcttc 1920
tatgaaggag ataaacgcga accggtggaa agcatggccc aatggggatt cactgcgact 1980
acaactcctt ggagcatcgt atgggatgag gacaacacca gcgatttggc agcagcttca 2040
cattccatgt actctccggc aggaaagtct gacgactgga tggtaacaac gcaaattttc 2100
attccgtcta accaatgcta tttaagatgg gagtcacaga gttatctcaa gagtaaagga 2160
gaccgtttga aaataatggt ttgggaatac gaccccgtat tgaatgctct gaacgacgat 2220
cttatcgcta aattcaagaa tgaaggaaag gttatttacg acgaatttga aaaacctggt 2280
gaagacgaaa acaaattagc aggagaatgg acttcccaca tcgtaaaact ggaagagttt 2340
aaaggcaaaa acgtatatat tgctttcgtg aacgagaacg aagatcaaag tgccatcttt 2400
atagataatg tggaagtaac caacgaccag aagtttttgg tgggtttgac caacgaaacc 2460
tccgtggtaa accaaaaaga aattaaaatc agtggccgta tcagtatcaa tgctttggaa 2520
gatacttatc agagcgtaca cattatcatg aaagacgcta atggtaacgc catagatgaa 2580
atcagcgaat ctggtctctc tttaaagaat ggtgacaagt atgactttgc tttccaaaaa 2640
gctctcccgc taagcgtagg catcgcaaac aagttcacat tggatatcac tttagatgat 2700
gaagaaaaaa caacaggata ttctatcaag aatctggctt tcgcaccgac caagcgtttg 2760
gtgatagaag aatttacagg tacggattgc ccgaactgtc cgctcggaat tctggctttg 2820
ggcaacatgg agaaaatgtt tggcgaccaa atcatcccca tggctatcca cacttacgat 2880
ggcgacatct actcgaccaa agaacttgaa gagtattccg cgttcctgaa cttcagtggt 2940
gctcccagtg gcgtggttaa ccgtcagggc ggagctaccc ccgttccttc ttatccaatg 3000
gccagcgtta aaaacacaga aggaaaagtg aattatatat tcacaaacgg ttcggattta 3060
tggttagacc aggccgagaa agaattcaaa gtggctgcag atgccgaact gaacatcacg 3120
gcgaagtatc aggacggcaa aatcgtagtg ccttgtacct ataaatatgc actgaatgcg 3180
acagatctaa acgtaagttt gttcatggct attctggaag ataacttaac ccgttatcaa 3240
tgtaacaact tgggaggtac ttctgacccg aatctcggag aatggggatt gggcggccag 3300
tacgctgcca gtgttgtggc accttacacg ttcaacgatg tagttcgcag catacccagt 3360
gcttactatg gtgtaagcgg tctgattcct tcttccgtaa cagccggtga ggagaataca 3420
actagcctgg acttaaatat cccggaaaat gtcattgacc tcacgaactg taaagttgta 3480
tgtatgatga tcgatgccaa cacgggaaac atcatcaatg cagccagagc tgacatcaat 3540
acagacgatt ataattccat cgaacaaaca aaagcagacg aaagtatctc cgtgaatgct 3600
gacaacaaca ccatccacat taatgccggc acaacggcac aagttacagt gtacagtgta 3660
gacggaagtg ttctcgacca agttgttatc gaaggagaag gcagcctgaa gctgcaaggc 3720
cataagggta tcgtcttggt tgaggtaacg accgagaata ctcgtgttgt gaagaaagta 3780
ttcgtaaaat aa 3792
<210> 2
<211> 3084
<212> DNA
<213> Paraprevotella clara
<400> 2
atgagagaga aactttacac gaaatggaag ggaggcctga ataggctttg ttttcttctt 60
atgtgttttt gttggactac ggttcagtca tgggccgttg gtgaagattt acatttaacc 120
attgaaaatg gtaagacgta tgaatttgaa gctttcaaca gttattatct tacttatgta 180
gccacggcta acggacaatt atcattgtat caaaccggtg gagacttttg tcgtcagtat 240
acagataaca cttttgaaac ggaattgcct tctacccctc agtatgttaa tgagggaaaa 300
cttgtagaag tgaaggttga gtcaggtaag acttattatt tcttgactcg aggtttaagt 360
aaaggagaac tgaccgttac tttcggagaa aaagcaactc cacttgaatt actttcgctc 420
tccaaagaag agggaacgac attgaatctt tctattgata ccctcttagg cttcactttc 480
aatagaatgg taaaagtagg aaattgcaca ttgtcttccg ggtcggtaat cggaaacttg 540
accgcttcga cgcatgatta tggtttcacc gtttccatta aagatgtgtt gtataaatgg 600
ttgaaggagg gtaatgtaaa agccggagat gaagtcgttt tgactgtaac tggattatgt 660
aatgccaatg acgagagcga caaatacaat ggtaatggcg tgctgacggt gaaatatatt 720
gcaggagcgt tgcctgccga gctggttagc gttgcgaact cacccgaaga gatgaatttc 780
ctgtcttatt acctgcctac tgatgaaagg gggttggtta ctttgacttt caaccgttcg 840
atgggcgaag atgctacggc aaaattattc tatggcgata tagaaggttc aaatgtctat 900
acggaaagtc ttccggtgaa ggtgaaagat actcagttga ttgtagacct ccgtggaaaa 960
cgtcgtttgc catctgatat gttaccgaat gccagtgctg atgtgtggga acagtataaa 1020
accatttcct taaaggttag caatgtgagg gacgcggaag gaaattatgc catgtctacc 1080
tctcagggta ccacaggttc gtatgctttt aattatacaa tagaagctgt tgagtttgat 1140
atcgctcgtc agtttacacc agatgaaggc acatctttgg atgattatcc gacaatagaa 1200
ctttggattc gggaagacga ttcgtttact tatgagggtg tagatttcac ttatagcgtg 1260
aatggaaaac ctgagacggt atttgtaccc atcagccaga ttacgaaaaa agactctcca 1320
gacggagatg gtgtactatt gacgattccg gtgccgaata aagcggcgga tgagaattca 1380
gatgtcgtag tttctttgaa taatttgaaa gcggccgatg gtgcggatta ttcagaagat 1440
tttacgataa tatataaaac gtcaggaaag agtatggcag ggcttgagat agaatctgtg 1500
tctcctgctg atggagcagc tatcgcggct ttagaagctg gttcgtttct ggaattgtct 1560
accaatatga atgatcgtgt aggatatgta tggtttagaa tagatgacca gaacccgaag 1620
gatcctgatc aggcttgtgt taagacgatg acatcgatga agaaagagac ggttgaggga 1680
aaagtttctt tccggacaga aataattcgt agtgtgactt tctttgaagg gcatacttat 1740
aaagttactt tcaatgctta tgcttcggag agcgattatc agcacggtgc agatgctctt 1800
ggtgttatga ctgtgactta cactggtaca actccggaat ttaagtttag tccggttaag 1860
tttgtaggaa taacgccgga tcctgattat actgtaattt cagacgtaag ccagaatgaa 1920
tttacgttga cttttgacgg tcctgtagta ttgaatcatg aaaatgcttt tgtagtatac 1980
ggtcagggaa tgaattatcc atttgaaagt atcgagtcga atgaagacgg tacggaatgg 2040
acggtaaagg catccttgga aaaattaatg gaaatgggta tgatgcctcg attgtctatg 2100
agctttatgc cggtggataa agacggaatg ttagttgaag gtaatgccgg taaagatgcg 2160
acaagttatt tgaattttgc ctatgattgt acgattggta taccagattt gcaagtgagt 2220
ccggaaagcg gttctaaagt ggaaagtttg aaagagatca tagtaggatg taaagatatc 2280
atggacgaca atggtgaggt gatatttcaa ggtggtatca gcgaatctta tatggcggcc 2340
gaaaagatca ttttgtataa gaacggacgt gagccggttg caacggtaac gagtatcgaa 2400
cctattatac cggaggacca agaggataat tacttctatg ttcctgtgga aatgaaatta 2460
acattggata ctgaaattac agataacggt aattatcgtt tgcacattcc tgccaactac 2520
tttatattgg gaacaggtat gtctaatgta aacagtaaag aaacaagtgt gctgtatact 2580
atcgaccagc cgattaagat aacggttacg ccagagaata actctacggt agaaagcctg 2640
aaagaaatta ctatcgaatg tgaatcaggt atagatgtgc cgagtatggg tacaattcaa 2700
ttattgaatg ctcagaatga ggtggtagct tcggcaacgg gtgaggattg cgaattgttg 2760
ttacccgaag gtgccggtcc ttgggatcct tatactggtg tgacaattaa attggatcaa 2820
gaggtgacag agaaaggtac atataagttg gttattcctg aaggattctt ctatttgggt 2880
gaaaactatg aaaattcgga cgaaatgacg tttacttatt cgattaatgc cagtggtatt 2940
cattcgattg gtactgagtc taagggagta gtcgtttata ctgtagacgg taaatttatc 3000
ttaaagtcga gcgatgccaa agacgtgaag aatctgaaga aaggtttgta cattgtgaac 3060
ggtaagaaga tgatggtaaa ataa 3084
<210> 3
<211> 1483
<212> DNA
<213> Paraprevotella clara
<400> 3
gatgaacgct agctacaggc ttaacacatg caagtcgagg ggcagcatgr rcycagcttt 60
gctgggtttg atggcgaccg gcgcacgggt gagtaacgcg tatccaacct gccctttact 120
ccgggatagt ctcctgaaag ggagtttaat accggatgtg tttgtctttc cgcatgggag 180
cgacaaataa agattgattg gtaaaggatg gggatgcgtc ccattagctr gttggcgggg 240
taacggccca ccaaggcrac gatgggtagg ggttctgaga ggaaggtccc ccacattgga 300
actgagacac ggtccaaact cctacgggag gcagcagtga ggaatattgg tcaatgggcg 360
agagcctgaa ccagccaagt agcgtgaagg acgacggccc tacgggttgt aaacttcttt 420
tataagggaa taaagttcgc cacgcgtggt gttttgtatg taccttatga ataagcatcg 480
gctaattccg tgccagcagc cgcggtaata cggaagatgc gagcgttatc cggatttatt 540
gggtttaaag ggagcgtagg cgggctttta agtcagcggt caaatgtcac ggctcaaccg 600
tggccagccg ttgaaactgt aagccttgag tctgcacagg gcacatggaa ttcgtggtgt 660
agcggtgaaa tgcttagata tcacgaagaa ctccgatcgc gaaggcattg tgccggggca 720
gcactgacgc tgaggctcga aagtgcgggt atcaaacagg attagatacc ctggtagtcc 780
gcacggtaaa cgatgaatgc tcgctatggg cgatayawtg tccgtggcca agcgaaagcg 840
ttaagcattc cacctgggga gtacgccggc aacggtgaaa ctcaaaggaa ttgacggggg 900
cccgcacaag cggaggaaca tgtggtttaa ttcgatgata cgcgaggaac cttacccggg 960
cttgaattgc aggtgcatga gtcagagatg attctttcct tcgggactcc tgtgaaggtg 1020
ctgcatggtt gtcgtcagct cgtgccgtga ggtgtcggct taagtgccat aacgagcgca 1080
acccttctcc ccagttgcca tcgggtaatg ccgggccctc tggggacact gccatcgtaa 1140
gatgcgagga aggtggggat gacgtcaaat cagcacggcc cttacgtccg gggctacaca 1200
cgtgttacaa tggggggtac agagggccgc tgtccggtga cggttggcca atccctaaaa 1260
cccctctcag ttcggactgg agtctgcaac ccgactccac gaagctggat tcgctagtaa 1320
tcgcgcatca gccatggcgc ggtgaatacg ttcccgggcc ttgtacacac cgcccgtcaa 1380
gccatgaaag ccgggggtgc ctgaagtccg tgaccgcgag ggtcggccta gggtaaaact 1440
ggtgattggg gctaagtcgt aacaaggtag ccgtaccgga agg 1483
<210> 4
<211> 1482
<212> DNA
<213> Paraprevotella xylaniphila
<400> 4
gatgaacgct agctacaggc ttaacacatg caagtcgagg ggcagcatga gctcagttat 60
tctgggtttg atggcgaccg gcgcacgggt gagtaacgcg tatccaacct gccctttacc 120
cggggatagc cttctgaaag gaagtttaat acccgatgat ttcgtttagt cgcatgactt 180
gatgaataaa gattaattgg taaaggatgg ggatgcgtcc cattagcttg ttggcggggt 240
aacggcccac caaggcgacg atgggtaggg gttctgagag gaaggtcccc cacattggaa 300
ctgagacacg gtccaaactc ctacgggagg cagcagtgag gaatattggt caatgggcgc 360
gagcctgaac cagccaagta gcgtggagga cgacggccct acgggttgta aactcctttt 420
ataaggggat aaagttggcc atgtatggcc atttgcaggt accttatgaa taagcatcgg 480
ctaattccgt gccagcagcc gcggtaatac ggaagatgcg agcgttatcc ggatttattg 540
ggtttaaagg gagcgtaggc gggcagtcaa gtcagcggtc aaatggcgcg gctcaaccgc 600
gttccgccgt tgaaactggc agccttgagt atgcacaggg tacatggaat tcgtggtgta 660
gcggtgaaat gcttagatat cacgaggaac tccgatcgcg caggcattgt accggggcat 720
tactgacgct gaggctcgaa ggtgcgggta tcaaacagga ttagataccc tggtagtccg 780
cacagtaaac gatgaatgcc cgctgtcggc gacatagtgt cggcggccaa gcgaaagcgt 840
taagcattcc acctggggag tacgccggca acggtgaaac tcaaaggaat tgacgggggc 900
ccgcacaagc ggaggaacat gtggtttaat tcgatgatac gcgaggaacc ttacccgggc 960
ttgaatcgca ggtgcatggg ccggagacgg ccctttcctt cgggactcct gcgaaggtgc 1020
tgcatggttg tcgtcagctc gtgccgtgag gtgtcggctt aagtgccata acgagcgcaa 1080
cccccctccc cagttgccac cgggtaatgc cgggcacttt ggggacactg ccaccgcaag 1140
gtgcgaggaa ggtggggatg acgtcaaatc agcacggccc ttacgtccgg ggcgacacac 1200
gtgttacaat ggggggtaca gagggccgct gcccggtgac ggttggccaa tccctaaaac 1260
ccctctcagt tcggactgga gtctgcaacc cgactccacg aagctggatt cgctagtaat 1320
cgcgcatcag ccatggcgcg gtgaatacgt tcccgggcct tgtacacacc gcccgtcaag 1380
ccatgaaagc cgggggtgcc tgaagtccgt gaccgcgagg gtcggcctag ggtaaaaccg 1440
gtgattgggg ctaagtcgta acaaggtagc cgtaccggaa gg 1482
<210> 5
<211> 1263
<212> PRT
<213> Paraprevotella clara
<400> 5
Met Lys Lys Lys Ser Cys Leu Val Phe Leu Phe Leu Leu Gly Ile Leu
1 5 10 15
Trp Asn Val Gln Val His Ala Asp Gly Val Ser Gln Pro Thr Phe His
20 25 30
Thr Phe Lys Phe Thr Asp Gln Ala Thr Leu Gln Asn Met Ser Asp Asn
35 40 45
Gly Lys Trp Ala Val Ala Phe Gly Thr Asn Gly Asn Ser Val Glu Asp
50 55 60
Phe Pro Lys Leu Ile Asp Leu Ala Thr Asp Lys Ala Thr Glu Leu Leu
65 70 75 80
Ser Glu Ser Gly Thr Ala Leu Gly Asn Gly Ala Tyr Asp Val Asn Asn
85 90 95
Glu Gly Thr Leu Val Val Gly Arg Tyr Glu Gly Asn Pro Ala Val Trp
100 105 110
Thr Lys Thr Asn Asn Lys Trp Thr Val Leu Pro Val Pro Thr Gly Trp
115 120 125
Asp Gly Gly Leu Val Asn Ala Ile Thr Pro Asp Gly Lys Trp Ala Ile
130 135 140
Gly Arg Ala Thr Lys Gly Gln Tyr Asp Glu Thr Pro Val Leu Trp Asp
145 150 155 160
Met Ser Lys Gly Gly Ile Ile Thr Glu Thr Pro Asn Ile Pro Val Lys
165 170 175
Asp Met Thr Gly Leu Asp Gln His Gln Ser Arg Phe Val Gly Ile Ser
180 185 190
Ala Asp Gly Arg Tyr Ile Val Gly Cys Leu Ser Phe Ser Tyr Ile Gln
195 200 205
Pro Ile Ala Cys Cys Tyr Tyr Val Tyr Asp Arg Asp Lys Gln Thr Tyr
210 215 220
Lys Phe Ile Gly Phe Asp Val Asp Glu Asn Tyr Lys Trp Thr Pro Leu
225 230 235 240
Ala Gln Asn Leu Ala Phe Ile Asp Asp Ala Arg Ile Ser Asn Asn Gly
245 250 255
Lys Tyr Ile Thr Gly Thr Ala Tyr Met Val Lys Glu Gly Asn Ala Glu
260 265 270
Tyr Lys Thr Pro Phe Leu Tyr Asp Val Glu Ala Gly Ser Phe Ser Ile
275 280 285
Tyr Asp Gln Asn Gln Asp Gln Gly Leu Ile Gly Leu Cys Cys Asp Asn
290 295 300
Glu Gly His Val Phe Gly Ala Ser Pro Ser Asp Ser Pro Ala Arg Glu
305 310 315 320
Trp Ser Ile Arg Val Gly Gln Tyr Trp Tyr Ser Leu Thr Gln Ile Leu
325 330 335
Lys Gln His Tyr Asn Ile Asp Phe Asn Thr Ala Ser Gly Phe Glu Asn
340 345 350
Ser Gly Thr Pro Ile Ala Leu Asn Val Glu Gly Thr Lys Ile Ala Val
355 360 365
Phe Val Tyr Lys Gly Glu Ser Tyr Ile Leu Glu Met Pro Gln Ser Leu
370 375 380
Val Glu Leu Thr Asn Glu Ile Asp Leu Leu Ala Asn Tyr Ser Val Ser
385 390 395 400
Pro Glu Glu Gly Ser Gln Phe Ser Ala Leu Lys Ser Leu Ser Leu Thr
405 410 415
Phe Asp Arg Asp Ile Glu Val Thr Gly Lys Ser Ser Ala Val Ile Leu
420 425 430
Lys Asp Glu Asn Gly Asn Lys Val Thr Ser Ser Val Thr Phe Lys Arg
435 440 445
Asn Ala Asn Asn Ser Lys Ile Val Asp Ile Ser Phe Arg Gly Ala Asn
450 455 460
Leu Glu Asp Gly Lys Lys Tyr Thr Val Ser Val Pro Ala Gly Ser Ile
465 470 475 480
Val Ile Asn Gly Asp Ala Glu Lys Ala Cys Lys Glu Ile Asn Ile Ser
485 490 495
Tyr Thr Gly Arg Ala Asn Lys Pro Val Ala Pro Thr Ser Ile Ala Pro
500 505 510
Val Asp Asn Ser Thr Leu Ser Leu Leu Asn Phe Ser Thr Asn Pro Ile
515 520 525
Ile Ile Asn Phe Asp Ala Gly Ile Ala Leu Thr Asp Thr Ala Tyr Ala
530 535 540
Ala Val Tyr Arg Asn Asn Glu Thr Glu Pro Met Cys Glu Leu Lys Met
545 550 555 560
Ala Val Ser Ser Leu Asp Ser Lys Gln Leu Gly Val Tyr Pro Ser Ala
565 570 575
Gly Gln Tyr Leu Tyr Lys Gly Asn Glu Tyr Tyr Val Lys Ile Lys Ala
580 585 590
Gly Ser Val Thr Asp Val Leu Gly Ser Asn Pro Asn Glu Ala Ile Asn
595 600 605
Leu His Tyr Thr Gly Asn Tyr Glu Arg Glu Ile Asn Tyr Asp Asp Val
610 615 620
Thr Leu Phe Lys Asp Asp Phe Asn Asn Gly Leu Gly Gln Phe Leu Phe
625 630 635 640
Tyr Glu Gly Asp Lys Arg Glu Pro Val Glu Ser Met Ala Gln Trp Gly
645 650 655
Phe Thr Ala Thr Thr Thr Pro Trp Ser Ile Val Trp Asp Glu Asp Asn
660 665 670
Thr Ser Asp Leu Ala Ala Ala Ser His Ser Met Tyr Ser Pro Ala Gly
675 680 685
Lys Ser Asp Asp Trp Met Val Thr Thr Gln Ile Phe Ile Pro Ser Asn
690 695 700
Gln Cys Tyr Leu Arg Trp Glu Ser Gln Ser Tyr Leu Lys Ser Lys Gly
705 710 715 720
Asp Arg Leu Lys Ile Met Val Trp Glu Tyr Asp Pro Val Leu Asn Ala
725 730 735
Leu Asn Asp Asp Leu Ile Ala Lys Phe Lys Asn Glu Gly Lys Val Ile
740 745 750
Tyr Asp Glu Phe Glu Lys Pro Gly Glu Asp Glu Asn Lys Leu Ala Gly
755 760 765
Glu Trp Thr Ser His Ile Val Lys Leu Glu Glu Phe Lys Gly Lys Asn
770 775 780
Val Tyr Ile Ala Phe Val Asn Glu Asn Glu Asp Gln Ser Ala Ile Phe
785 790 795 800
Ile Asp Asn Val Glu Val Thr Asn Asp Gln Lys Phe Leu Val Gly Leu
805 810 815
Thr Asn Glu Thr Ser Val Val Asn Gln Lys Glu Ile Lys Ile Ser Gly
820 825 830
Arg Ile Ser Ile Asn Ala Leu Glu Asp Thr Tyr Gln Ser Val His Ile
835 840 845
Ile Met Lys Asp Ala Asn Gly Asn Ala Ile Asp Glu Ile Ser Glu Ser
850 855 860
Gly Leu Ser Leu Lys Asn Gly Asp Lys Tyr Asp Phe Ala Phe Gln Lys
865 870 875 880
Ala Leu Pro Leu Ser Val Gly Ile Ala Asn Lys Phe Thr Leu Asp Ile
885 890 895
Thr Leu Asp Asp Glu Glu Lys Thr Thr Gly Tyr Ser Ile Lys Asn Leu
900 905 910
Ala Phe Ala Pro Thr Lys Arg Leu Val Ile Glu Glu Phe Thr Gly Thr
915 920 925
Asp Cys Pro Asn Cys Pro Leu Gly Ile Leu Ala Leu Gly Asn Met Glu
930 935 940
Lys Met Phe Gly Asp Gln Ile Ile Pro Met Ala Ile His Thr Tyr Asp
945 950 955 960
Gly Asp Ile Tyr Ser Thr Lys Glu Leu Glu Glu Tyr Ser Ala Phe Leu
965 970 975
Asn Phe Ser Gly Ala Pro Ser Gly Val Val Asn Arg Gln Gly Gly Ala
980 985 990
Thr Pro Val Pro Ser Tyr Pro Met Ala Ser Val Lys Asn Thr Glu Gly
995 1000 1005
Lys Val Asn Tyr Ile Phe Thr Asn Gly Ser Asp Leu Trp Leu Asp
1010 1015 1020
Gln Ala Glu Lys Glu Phe Lys Val Ala Ala Asp Ala Glu Leu Asn
1025 1030 1035
Ile Thr Ala Lys Tyr Gln Asp Gly Lys Ile Val Val Pro Cys Thr
1040 1045 1050
Tyr Lys Tyr Ala Leu Asn Ala Thr Asp Leu Asn Val Ser Leu Phe
1055 1060 1065
Met Ala Ile Leu Glu Asp Asn Leu Thr Arg Tyr Gln Cys Asn Asn
1070 1075 1080
Leu Gly Gly Thr Ser Asp Pro Asn Leu Gly Glu Trp Gly Leu Gly
1085 1090 1095
Gly Gln Tyr Ala Ala Ser Val Val Ala Pro Tyr Thr Phe Asn Asp
1100 1105 1110
Val Val Arg Ser Ile Pro Ser Ala Tyr Tyr Gly Val Ser Gly Leu
1115 1120 1125
Ile Pro Ser Ser Val Thr Ala Gly Glu Glu Asn Thr Thr Ser Leu
1130 1135 1140
Asp Leu Asn Ile Pro Glu Asn Val Ile Asp Leu Thr Asn Cys Lys
1145 1150 1155
Val Val Cys Met Met Ile Asp Ala Asn Thr Gly Asn Ile Ile Asn
1160 1165 1170
Ala Ala Arg Ala Asp Ile Asn Thr Asp Asp Tyr Asn Ser Ile Glu
1175 1180 1185
Gln Thr Lys Ala Asp Glu Ser Ile Ser Val Asn Ala Asp Asn Asn
1190 1195 1200
Thr Ile His Ile Asn Ala Gly Thr Thr Ala Gln Val Thr Val Tyr
1205 1210 1215
Ser Val Asp Gly Ser Val Leu Asp Gln Val Val Ile Glu Gly Glu
1220 1225 1230
Gly Ser Leu Lys Leu Gln Gly His Lys Gly Ile Val Leu Val Glu
1235 1240 1245
Val Thr Thr Glu Asn Thr Arg Val Val Lys Lys Val Phe Val Lys
1250 1255 1260
<210> 6
<211> 1263
<212> PRT
<213> Paraprevotella clara
<400> 6
Met Lys Lys Lys Ser Cys Leu Val Phe Leu Phe Leu Leu Gly Ile Leu
1 5 10 15
Trp Asn Val Gln Val His Ala Asp Gly Val Ser Gln Pro Thr Phe His
20 25 30
Thr Phe Lys Phe Thr Asp Gln Ala Thr Leu Gln Asn Met Ser Asp Asn
35 40 45
Gly Lys Trp Ala Val Ala Phe Gly Thr Asn Gly Asn Ser Val Glu Asp
50 55 60
Phe Pro Lys Leu Ile Asp Leu Ala Thr Asp Lys Ala Thr Glu Leu Leu
65 70 75 80
Ser Glu Ser Gly Thr Ala Leu Gly Asn Gly Ala Tyr Asp Val Asn Asn
85 90 95
Glu Gly Thr Leu Val Val Gly Arg Tyr Glu Gly Asn Pro Ala Val Trp
100 105 110
Thr Lys Thr Asn Asn Lys Trp Thr Val Leu Pro Val Pro Thr Gly Trp
115 120 125
Asp Gly Gly Leu Val Asn Ala Ile Thr Pro Asp Gly Lys Trp Ala Ile
130 135 140
Gly Arg Ala Thr Lys Gly Gln Tyr Asp Glu Thr Pro Val Leu Trp Asp
145 150 155 160
Met Ser Lys Gly Gly Ile Ile Thr Glu Thr Pro Asn Ile Pro Val Lys
165 170 175
Asp Met Thr Gly Leu Asp Gln His Gln Ser Arg Phe Val Gly Ile Ser
180 185 190
Ala Asp Gly Arg Tyr Ile Val Gly Cys Leu Ser Phe Ser Tyr Ile Gln
195 200 205
Pro Ile Ala Cys Cys Tyr Tyr Val Tyr Asp Arg Asp Lys Gln Thr Tyr
210 215 220
Lys Phe Ile Gly Phe Asp Val Asp Glu Asn Tyr Lys Trp Thr Pro Leu
225 230 235 240
Ala Gln Asn Leu Ala Phe Ile Asp Asp Ala Arg Ile Ser Asn Asn Gly
245 250 255
Lys Tyr Ile Thr Gly Thr Ala Tyr Met Val Lys Glu Gly Asn Ala Glu
260 265 270
Tyr Lys Thr Pro Phe Leu Tyr Asp Val Glu Ala Gly Ser Phe Ser Ile
275 280 285
Tyr Asp Gln Asn Gln Asp Gln Gly Leu Ile Gly Leu Cys Cys Asp Asn
290 295 300
Glu Gly His Val Phe Gly Ala Ser Pro Ser Asp Ser Pro Ala Arg Glu
305 310 315 320
Trp Ser Ile Arg Val Gly Gln Tyr Trp Tyr Ser Leu Thr Gln Ile Leu
325 330 335
Lys Gln His Tyr Asn Ile Asp Phe Asn Thr Val Ser Gly Phe Glu Asn
340 345 350
Ser Gly Thr Pro Ile Ala Leu Asn Val Glu Gly Thr Lys Ile Ala Val
355 360 365
Phe Val Tyr Lys Gly Glu Ser Tyr Ile Leu Glu Met Pro Gln Ser Leu
370 375 380
Val Glu Leu Thr Asn Glu Ile Asp Leu Leu Ala Asn Tyr Ser Val Ser
385 390 395 400
Pro Glu Glu Gly Ser Gln Phe Ser Ala Leu Lys Ser Leu Ser Leu Thr
405 410 415
Phe Asp Arg Asp Ile Glu Val Thr Gly Lys Ser Ser Ala Val Ile Leu
420 425 430
Lys Asp Glu Asn Gly Asn Lys Val Thr Ser Ser Val Thr Phe Lys Arg
435 440 445
Asn Ala Asn Asn Ser Lys Ile Val Asp Ile Ser Phe Arg Gly Ala Asn
450 455 460
Leu Glu Asp Gly Lys Lys Tyr Thr Val Ser Val Pro Ala Gly Ser Ile
465 470 475 480
Val Ile Asn Gly Asp Ala Glu Lys Ala Cys Lys Glu Ile Asn Ile Ser
485 490 495
Tyr Thr Gly Arg Ala Asn Lys Pro Val Ala Pro Thr Ser Ile Ala Pro
500 505 510
Val Asp Asn Ser Thr Leu Ser Leu Leu Asn Phe Ser Thr Asn Pro Ile
515 520 525
Ile Ile Asn Phe Asp Ala Gly Ile Ala Leu Thr Asp Thr Ala Tyr Ala
530 535 540
Ala Val Tyr Arg Asn Asn Glu Thr Glu Pro Met Cys Glu Leu Lys Met
545 550 555 560
Ala Val Ser Ser Leu Asp Ser Lys Gln Leu Gly Val Tyr Pro Ser Ala
565 570 575
Gly Gln Tyr Leu Tyr Lys Gly Asn Glu Tyr Tyr Val Lys Ile Lys Ala
580 585 590
Gly Ser Val Thr Asp Val Leu Gly Ser Asn Pro Asn Glu Ala Ile Asn
595 600 605
Leu His Tyr Thr Gly Asn Tyr Glu Arg Glu Ile Asn Tyr Asp Asp Val
610 615 620
Thr Leu Phe Lys Asp Asp Phe Asn Asn Gly Leu Gly Gln Phe Leu Phe
625 630 635 640
Tyr Glu Gly Asp Lys Arg Glu Pro Val Glu Ser Met Val Gln Trp Gly
645 650 655
Phe Thr Ala Thr Thr Thr Pro Trp Ser Ile Val Trp Asp Glu Asp Asn
660 665 670
Thr Ser Asp Leu Ala Ala Ala Ser His Ser Met Tyr Ser Pro Ala Gly
675 680 685
Lys Ser Asp Asp Trp Met Val Thr Thr Gln Ile Phe Ile Pro Ser Asn
690 695 700
Gln Cys Tyr Leu Arg Trp Glu Ser Gln Ser Tyr Leu Lys Ser Lys Gly
705 710 715 720
Asp Arg Leu Lys Ile Met Val Trp Glu Tyr Asp Pro Val Leu Asn Ala
725 730 735
Leu Asn Asp Asp Leu Ile Ala Lys Phe Lys Asn Glu Gly Lys Val Ile
740 745 750
Tyr Asp Glu Phe Glu Lys Pro Gly Glu Asp Glu Asn Lys Leu Ala Gly
755 760 765
Glu Trp Thr Ser His Ile Val Lys Leu Glu Glu Phe Lys Gly Lys Asn
770 775 780
Val Tyr Ile Ala Phe Val Asn Glu Asn Glu Asp Gln Ser Ala Ile Phe
785 790 795 800
Ile Asp Asn Val Glu Val Thr Asn Asp Gln Lys Phe Leu Val Gly Leu
805 810 815
Thr Asn Glu Thr Ser Val Val Asn Gln Lys Glu Ile Lys Ile Ser Gly
820 825 830
Arg Ile Ser Ile Asn Ala Leu Glu Asp Thr Tyr Gln Ser Val His Ile
835 840 845
Ile Met Lys Asp Ala Asn Gly Asn Ala Ile Asp Glu Ile Ser Glu Ser
850 855 860
Gly Leu Ser Leu Lys Asn Gly Asp Lys Tyr Asp Phe Ala Phe Gln Lys
865 870 875 880
Ala Leu Pro Leu Ser Val Gly Ile Ala Asn Lys Phe Thr Leu Asp Ile
885 890 895
Thr Leu Asp Asp Glu Glu Lys Thr Thr Gly Tyr Ser Ile Lys Asn Leu
900 905 910
Ala Phe Ala Pro Thr Lys Arg Leu Val Ile Glu Glu Phe Thr Gly Thr
915 920 925
Asp Cys Pro Asn Cys Pro Leu Gly Ile Leu Ala Leu Gly Asn Met Glu
930 935 940
Lys Met Phe Gly Asp Gln Ile Ile Pro Met Ala Ile His Thr Tyr Asp
945 950 955 960
Gly Asp Ile Tyr Ser Thr Lys Glu Leu Glu Glu Tyr Ser Ala Phe Leu
965 970 975
Asn Phe Ser Gly Ala Pro Ser Gly Val Val Asn Arg Gln Gly Gly Ala
980 985 990
Thr Pro Val Pro Ser Tyr Pro Met Ala Ser Val Lys Asn Thr Glu Gly
995 1000 1005
Lys Val Asn Tyr Ile Phe Thr Asn Gly Ser Asp Leu Trp Leu Asp
1010 1015 1020
Gln Ala Glu Lys Glu Phe Lys Val Ala Ala Asp Ala Glu Leu Asn
1025 1030 1035
Ile Thr Ala Lys Tyr Gln Asp Gly Lys Ile Val Val Pro Cys Thr
1040 1045 1050
Tyr Lys Tyr Ala Leu Asn Ala Thr Asp Leu Asn Val Ser Leu Phe
1055 1060 1065
Met Ala Ile Leu Glu Asp Asn Leu Thr Arg Tyr Gln Cys Asn Asn
1070 1075 1080
Leu Gly Gly Thr Ser Asp Pro Asn Leu Gly Glu Trp Gly Leu Gly
1085 1090 1095
Gly Gln Tyr Ala Ala Ser Val Val Ala Pro Tyr Thr Phe Asn Asp
1100 1105 1110
Val Val Arg Ser Ile Pro Ser Ala Tyr Tyr Gly Val Ser Gly Leu
1115 1120 1125
Ile Pro Ser Ser Val Thr Ala Gly Glu Glu Asn Thr Thr Ser Leu
1130 1135 1140
Asp Leu Asn Ile Pro Glu Asn Ile Ile Asp Leu Thr Asn Cys Lys
1145 1150 1155
Val Val Cys Met Met Ile Asp Ala Asn Thr Gly Asn Ile Ile Asn
1160 1165 1170
Ala Ala Arg Ala Asp Ile Asn Thr Asp Asp Tyr Asn Ser Ile Glu
1175 1180 1185
Gln Thr Lys Ala Asp Glu Ser Ile Ser Val Asn Ala Asp Asn Asn
1190 1195 1200
Thr Ile His Ile Asn Ala Gly Thr Thr Ala Gln Val Thr Val Tyr
1205 1210 1215
Ser Val Asp Gly Ser Val Leu Asp Gln Val Val Ile Glu Gly Glu
1220 1225 1230
Gly Ser Leu Lys Leu Gln Gly His Lys Gly Ile Val Leu Val Glu
1235 1240 1245
Val Thr Thr Glu Asn Thr Arg Val Val Lys Lys Val Phe Val Lys
1250 1255 1260
<210> 7
<211> 3792
<212> DNA
<213> Paraprevotella clara
<400> 7
atgaaaaaaa aatcttgttt ggtgttcctc tttctattgg gaatactatg gaatgtacaa 60
gtccacgctg acggtgtgag ccaacctacg ttccatacct tcaagttcac agaccaagct 120
acactccaga acatgtccga caatgggaaa tgggctgtag ctttcggtac gaacggtaac 180
agtgtggaag attttccgaa gctcatagac ttggctacgg acaaggctac cgaacttttg 240
agcgaaagcg gaaccgcatt aggcaacggc gcctacgatg tcaataacga gggaactctc 300
gtagtaggac gatatgaagg aaatccggcc gtttggacca aaaccaataa caaatggaca 360
gtacttcctg ttcctacggg atgggacggt ggtttagtta acgccattac tcccgacgga 420
aagtgggcca taggtagggc caccaaaggt cagtatgacg aaactcccgt actgtgggat 480
atgagtaaag gcggtataat cacagaaacg cctaatatcc cggtaaaaga catgaccgga 540
ctagaccagc accaaagccg ctttgtaggc atttccgctg acggacgtta tatcgtaggt 600
tgcctgtctt tcagctatat ccagcctatc gcttgttgtt attacgtcta tgatcgggat 660
aagcaaacat ataaattcat cggattcgat gtagatgaga attacaaatg gactccgttg 720
gctcagaatc tggcctttat cgatgatgca cgtatcagca acaacggaaa gtatataacg 780
ggtactgctt atatggtaaa agaaggaaat gccgaataca aaactccttt tctttatgac 840
gttgaagccg gttccttctc tatttacgac cagaaccagg accaaggtct catcggacta 900
tgttgtgata acgagggtca tgtgttcggt gcttcacctt cagacagtcc ggcacgtgaa 960
tggagcatac gcgtaggcca atactggtac agccttaccc aaatcctgaa acagcattac 1020
aatatagact ttaatacagt ttcggggttt gaaaatagcg gaactccgat tgcccttaat 1080
gtagaaggta ctaaaatcgc cgtcttcgtc tataaaggtg aaagttatat actggagatg 1140
cctcagtcgc tcgttgaact gaccaatgaa attgatttat tggccaacta ttccgtcagc 1200
ccggaggaag gttcacaatt ctctgccctg aaatcactaa gcctcacttt tgaccgcgac 1260
attgaagtga caggaaagag cagtgccgta atcttaaaag acgaaaacgg gaacaaagta 1320
acaagttcag taacgttcaa acgtaacgcc aataacagca aaatcgtcga tattagtttc 1380
cgtggcgcaa acctcgaaga cggtaaaaaa tatactgttt ctgtcccggc tggttccatc 1440
gttattaatg gtgatgccga gaaggcctgc aaggaaatta acattagcta caccggacgg 1500
gccaataagc cagtcgcacc tactagcatt gcaccggttg acaactcaac cctatcattg 1560
ttgaacttct ctacgaaccc tatcatcatt aattttgatg caggaatcgc cttgaccgac 1620
acggcttacg ccgcagttta ccgcaataac gaaacagagc cgatgtgcga actgaaaatg 1680
gctgtgtcct ctctcgacag taaacaatta ggcgtgtacc cgtctgccgg acaatatttg 1740
tataagggta atgagtatta cgtaaaaatc aaggcaggat cagtcacgga cgttttaggt 1800
agcaatccga acgaagccat caacctgcac tacacgggta actatgaacg tgaaatcaac 1860
tacgacgatg taacattgtt caaagacgat tttaacaacg gtttgggcca attcctcttc 1920
tatgaaggag ataaacgcga accggtggaa agcatggtcc aatggggatt cactgcgact 1980
acaactcctt ggagcatcgt atgggatgag gacaacacca gcgatttggc agcagcttca 2040
cattccatgt actctccggc aggaaagtct gacgactgga tggtaacaac gcaaattttc 2100
attccgtcta accaatgcta tttaagatgg gagtcacaga gttatcttaa gagtaaagga 2160
gaccgtttga aaataatggt ttgggaatac gaccccgtat tgaatgctct gaacgacgat 2220
cttatcgcta aattcaagaa tgaaggaaag gttatttacg acgaatttga aaaacctggt 2280
gaagacgaaa acaaattagc aggagaatgg acttcccaca tcgtaaaact ggaagagttt 2340
aaaggcaaaa acgtatatat tgctttcgtg aacgagaacg aagatcaaag tgccatcttt 2400
atagataatg tggaagtaac caacgaccag aagtttttgg tgggtttgac caacgaaacc 2460
tccgtggtaa accaaaaaga aattaaaatc agtggccgta tcagtatcaa tgctttggaa 2520
gatacttatc agagcgtaca cattatcatg aaagacgcta atggtaacgc catagatgaa 2580
atcagcgaat ctggtctctc tttaaagaat ggtgacaagt atgactttgc tttccaaaaa 2640
gctctcccgc taagcgtagg catcgcaaac aagttcacat tggatatcac tttagatgat 2700
gaagaaaaaa caacaggata ttctatcaag aatctggctt tcgcaccgac caagcgtttg 2760
gtgatagaag aatttacagg tacggattgc ccgaactgtc cgctcggaat tctggctttg 2820
ggcaacatgg agaaaatgtt tggcgaccaa atcatcccca tggctatcca cacttacgat 2880
ggcgacatct actcgaccaa agaacttgaa gagtattccg cgttcctgaa cttcagtggt 2940
gctcccagcg gagtagtcaa tcgtcagggc ggagctaccc ccgttccttc ttatccaatg 3000
gccagcgtta aaaacacaga aggaaaagtg aattatatat tcacaaacgg ttcggattta 3060
tggttagacc aggccgagaa agaattcaaa gtggctgcag atgccgaact gaacatcacg 3120
gcgaagtatc aggacggcaa aatcgtagtg ccttgtacct ataaatatgc actgaatgcc 3180
acggatctaa atgtaagttt gttcatggca attctggaag ataacttaac ccgttatcaa 3240
tgtaacaact tgggaggtac ttctgacccg aatctcggag aatggggatt gggcggccag 3300
tacgctgcca gtgtagtggc accttacacg ttcaacgatg tagttcgcag catacccagt 3360
gcttactatg gtgtaagcgg tctgattcct tcttccgtaa cagccggtga ggagaatacg 3420
actagtctgg acttaaatat cccggaaaat atcattgatc tcacaaactg taaagtagta 3480
tgtatgatga tcgatgccaa cacgggaaac atcatcaatg cagccagagc tgatatcaat 3540
acagacgatt ataattccat cgaacaaaca aaagcagacg aaagtatctc cgtgaatgct 3600
gacaacaaca ccatccacat taatgccggc acaacggcac aagttacggt gtacagtgta 3660
gacggaagtg ttctcgacca agttgttatc gaaggagaag gtagcctgaa gctgcaaggc 3720
cataagggta tcgtcttggt tgaggtaacg accgagaata ctcgtgttgt gaagaaagta 3780
ttcgtaaaat aa 3792
<210> 8
<211> 1198
<212> PRT
<213> Paraprevotella xylaniphila
<400> 8
Met Phe Pro Lys Leu Ile Asp Leu Ser Ser Asn Gln Val Thr Glu Leu
1 5 10 15
Leu Thr Gly Glu Asp Tyr Ser Asn Leu Thr Ser Ala Ala Tyr Asp Val
20 25 30
Thr Asp Asp Gly Ser Met Val Ala Gly Ser Tyr Asp Gly Gln Pro Ala
35 40 45
Ile Tyr Ile Thr Ala Gln Lys Arg Trp Ser Met Leu Ala Val Pro Asp
50 55 60
Ser Cys Val Gly Gly Glu Val Thr Ala Ile Thr Pro Asp Gly His Tyr
65 70 75 80
Ala Val Gly Arg Gly Leu Tyr Ala Asn Thr Tyr Met Glu Ala Val Val
85 90 95
Met Trp Asp Leu Gln Gln Asp Gly Leu Leu Leu Asp Leu Ser Gly Ile
100 105 110
Pro Phe Glu Asp Met Val Gly Glu Asn Asn Asn Gln Arg Arg Leu Ile
115 120 125
Gly Val Thr Pro Asp Gly Asn Lys Val Leu Gly Cys Ile Ser Tyr Ser
130 135 140
Tyr Val Ser Pro Ala Gly Ile Phe Phe Phe Val Tyr Asp Arg Gln Thr
145 150 155 160
Lys Lys Cys Gln Pro Ile Gly Phe Asp Thr Thr Thr Glu Thr Thr Glu
165 170 175
Glu Gly Pro Gln Thr Val Trp Thr Pro Arg Ala Glu Gly Leu Tyr Phe
180 185 190
Ile Asn Ser Ala Asp Leu Ser Ala Asn Gly Arg Tyr Ile Thr Gly Glu
195 200 205
Ala Tyr Met Glu Asn Asp Ala Ala Lys Gly Tyr Val Tyr Asp Ile Glu
210 215 220
Thr Asp Lys Phe Thr Leu Tyr Ser Asp Ala Gln Asp Asp Met Pro Gly
225 230 235 240
Phe Thr Ala Cys Ser Asn Gly Ile Val Leu Gly Ala Thr Pro Pro Thr
245 250 255
Asn Pro Tyr Arg Asp Trp Ser Val Arg Val Gly Glu Tyr Trp Tyr Pro
260 265 270
Ile Ser Leu Ile Leu Ser Gln Arg Tyr Gly Ile Asn Phe Ser Thr Glu
275 280 285
Thr Asn Phe Ser Asn Thr Gly Thr Pro Ile Ala Thr Ser Leu Asp Gly
290 295 300
Lys Lys Ile Val Val Phe Val Ala Pro Gln Lys Glu Asn Tyr Ile Leu
305 310 315 320
Glu Leu Pro Glu Thr Leu Asp Ala Ala Thr Thr Asp Ile Asp Leu Leu
325 330 335
Gly Asn Tyr Ile Val Asn Pro Lys Ala Gly Ser Glu Phe Ser Ser Leu
340 345 350
Lys Thr Val Thr Phe Thr Phe Pro Tyr Glu Ile Glu Leu Leu Gly Asp
355 360 365
Lys Ser Asn Val Thr Phe Thr Asp Glu Asn Gly Lys Lys Thr Gly Thr
370 375 380
Ile Leu Ser Leu Lys Thr Asp Asp Lys Asp Leu Asn Ile Ser Phe Arg
385 390 395 400
Ser Thr Asn Leu Gln Ala Gly Val Lys Tyr Thr Leu Thr Leu Ser Ala
405 410 415
Gly Thr Ile Ala Ile Lys Gly Asp Thr Glu Arg Lys Asn Lys Glu Leu
420 425 430
Ser Val Thr Tyr Thr Gly Arg Asp Asn Val Pro Val Lys Met Val Ser
435 440 445
Ala Thr Pro Tyr Glu Asn Thr Glu Val Lys Gln Leu Asn Tyr Thr Asn
450 455 460
Asn Pro Ile Thr Ile Thr Phe Asp Thr Gln Leu Ser Leu Asn Glu Asp
465 470 475 480
Ala Ile Gly Tyr Leu Tyr Lys Asp Asp Glu Thr Thr Pro Asp Gly Thr
485 490 495
Leu Arg Leu Gly Val Thr Asp Asn Lys Leu Tyr Val Phe Pro Val Thr
500 505 510
Ala Arg Asn Leu Phe Lys Gly His Asn Tyr Lys Val Thr Val Pro Arg
515 520 525
Asn Ala Val Tyr Asp Ile Met Gly Asn Asn Gly Asn Asp Ser Ile Thr
530 535 540
Leu His Tyr Ile Gly Gly Phe Glu Arg Glu Val Ser Tyr Asp Asn Asp
545 550 555 560
Thr Leu Phe Val Glu Asp Phe Asn Asn Gly Phe Thr Lys Thr Met Leu
565 570 575
Tyr Glu Gly Asp His Asn Lys Pro Thr Glu Glu Met Gln Lys Met Asp
580 585 590
Phe Asn Asp Ser Asp Asn Tyr Pro Trp Thr Met Val Arg Asp Glu Asp
595 600 605
Asp Pro Asn Asn Tyr Ala Ala Ala Ser Thr Ser Leu Tyr Asp Pro Val
610 615 620
Gly Arg Ser Asp Asp Trp Met Val Thr Ala His Ile Tyr Ile Pro Asp
625 630 635 640
Asp Arg Cys Tyr Leu Gln Phe Asp Ala Gln Ser Phe Arg Glu Asn Lys
645 650 655
Lys Asp Ser Leu Tyr Val Leu Val Trp Val Thr Glu Asp Glu Phe Ser
660 665 670
Ser Met Asn Ser Glu Arg Ile Ala Lys Phe Lys Ala Glu Cys Asp Thr
675 680 685
Ile Tyr Ala Gly Ile Glu Thr Pro Gly Asp Tyr Glu Asp Phe Leu Ala
690 695 700
Gly Asp Trp Thr Glu His Thr Ala Ser Leu Gln Pro Tyr Ala Gly Lys
705 710 715 720
Asn Ile Tyr Ile Ala Phe Val Asn Ser Asn Glu Asn Gln Ser Cys Val
725 730 735
Phe Val Asp Asn Leu Leu Val Arg Gln Asp Gln Gln Phe Gln Ile Ser
740 745 750
Val Thr Thr Asp Gln Thr Val Val Lys Ala Lys Glu Ile Lys Ile Ser
755 760 765
Gly Gln Ile Arg Ile Thr Thr Pro Asp His Thr Tyr Ser Thr Leu Glu
770 775 780
Leu Thr Leu Lys Asp Ser Gln Gly Asn Met Val Asp Gln Ile Ser Glu
785 790 795 800
Ser Asn Leu Glu Leu Lys Glu Asn Asp Lys Tyr Asp Phe Ser Phe Asp
805 810 815
Lys Pro Leu Pro Leu Val Ile Gly Gln Glu Asn Glu Tyr Thr Ile Leu
820 825 830
Val Gln Leu Asp Glu Glu Gln Ser Glu Pro His Tyr Ser Val Lys Asn
835 840 845
Leu Ser Phe Gln Thr Val Lys Arg Val Val Leu Glu Glu Gly Thr Gly
850 855 860
Gln Asp Cys Pro Asn Cys Pro Gln Gly Ile Leu Ala Ile Glu Asn Leu
865 870 875 880
Glu Asp Val Tyr Gly Asp Leu Phe Ile Pro Val Ser Leu His Thr Tyr
885 890 895
Ser Gly Asp Pro Tyr Gly Thr Gly Phe Glu Ser Tyr Ala Ala Phe Leu
900 905 910
Asn Ile Thr Gly Tyr Pro Ser Gly Thr Val Asn Arg Thr Asn Gln Val
915 920 925
Pro Val Gly Ala Met Thr Ser Asp Glu Asn Gly Asn Ala Thr Phe Thr
930 935 940
Ser Thr Ala His Pro Cys Trp Leu Asp Tyr Val Ala Ala Glu Met Glu
945 950 955 960
Ile Ser Ala Asp Ala Asn Phe Asp Ile Ala Lys Ala Lys Val Glu Ala
965 970 975
Asp Ser Thr Ile Asn Met Thr Tyr Lys Tyr Gln Tyr Ala Leu Asn Val
980 985 990
Lys Gly Gln Asn Val Asn Leu Phe Thr Val Ile Leu Glu Asp Ser Leu
995 1000 1005
Ile Gly Lys Gln Arg Asn Ser Tyr Gly Ser Asn Ser Asp Pro Leu
1010 1015 1020
Leu Gly Glu Trp Gly Gln Gly Gly Lys Tyr Ser Ala Ala Thr Val
1025 1030 1035
Arg Asn Tyr Pro Phe Val Asp Val Val Arg Gly Val Val Gly Asp
1040 1045 1050
Thr Phe Tyr Gly Thr Ala Gly Tyr Ile Gln Pro Asn Val Glu Ala
1055 1060 1065
Gly Lys Glu Tyr Thr Ala Glu Ala Thr Phe Lys Leu Pro Ala Lys
1070 1075 1080
Val Leu Lys Ala Lys Asn Cys Lys Ile Val Thr Met Met Ile Asp
1085 1090 1095
Ala Asn Ser Gly Lys Val Ile Asn Ser Ala Arg Ala Lys Leu Asp
1100 1105 1110
Val Ser Asp Phe Ala Asn Ala Ile Ala Asp Ile Gln Ala Glu Lys
1115 1120 1125
Pro Asn Val Glu Ile Ala Val Val Asn Gly Gly Val Glu Ile Thr
1130 1135 1140
Thr Asp Ala Pro Ala Gln Ile Asn Leu Tyr Ser Leu Asn Gly Thr
1145 1150 1155
Leu Ile Gly Thr Ala Lys Ser Gln Gly Phe Val Ser Leu Ser Thr
1160 1165 1170
Asn Gly Tyr Arg Gly Leu Thr Leu Val Lys Val Thr Thr Gly His
1175 1180 1185
Gln Thr Leu Val Lys Lys Val Ile Val Lys
1190 1195
<210> 9
<211> 3597
<212> DNA
<213> Paraprevotella xylaniphila
<400> 9
atgtttccca aacttatcga cttaagctct aaccaagtca cggaactgct taccggcgaa 60
gactactcca acttgacaag cgcggcctat gacgtgacag acgacggtag catggtggca 120
ggaagctatg acggccaacc ggctatctat atcacagccc aaaaacgttg gagcatgttg 180
gccgtacccg acagctgtgt aggtggagaa gtcacagcca tcacccccga tggccattac 240
gccgtgggaa ggggcctata cgcaaatact tacatggaag ctgtcgtcat gtgggattta 300
caacaggacg gcctgcttct cgatttatcc ggcattccat ttgaagacat ggtaggcgaa 360
aataacaacc aacgccgtct gataggtgta acccctgatg gaaataaagt attaggctgc 420
atatcataca gctacgtaag ccctgccggc atcttctttt ttgtatatga ccgccaaacc 480
aagaagtgcc agcctatcgg tttcgatacc acaacagaaa caacagaaga aggccctcaa 540
acggtttgga caccgcgtgc cgaaggactt tattttatca actcagccga tctcagtgcc 600
aacggacgat atatcacagg cgaagcgtac atggaaaatg atgcagccaa aggttacgtt 660
tacgacatag aaacggataa attcacccta tacagcgatg cacaagatga catgcccggt 720
tttacagcat gcagtaatgg aatcgtgttg ggtgcaaccc ctccgaccaa tccttatcgt 780
gactggagcg tgcgcgtggg agaatattgg tatcccattt ctttgattct ctctcaacgt 840
tatggcatca acttctctac ggagactaat ttctccaata caggaacacc gatagccact 900
agcctggatg gcaaaaaaat cgttgtattc gtagctccac aaaaagaaaa ttatattttg 960
gaattaccgg agaccttgga tgcagcaacc acagatattg acttattagg taactacatc 1020
gtcaatccga aggccggaag tgaattctcg tctcttaaaa ccgtcacgtt tacattcccc 1080
tatgaaatag aactattagg agacaaaagc aatgtgacct ttacggacga aaacgggaaa 1140
aaaacgggta cgatcttaag tcttaaaacg gatgacaagg atttgaatat cagcttccgc 1200
agcacaaacc tgcaagcagg agtgaaatat accctgaccc tttctgccgg tactatcgcc 1260
atcaaaggcg acacggaacg taaaaacaaa gagctctccg tcacttatac aggacgtgat 1320
aatgttccgg taaaaatggt atctgccact ccgtatgaaa acacggaagt aaaacagctg 1380
aattacacca acaatcctat aaccataacg ttcgacaccc aattatccct caatgaggac 1440
gctatcggtt acctttacaa agacgacgaa acgaccccgg acggcacctt acgtttagga 1500
gtcacagaca ataaattata tgtgttccct gtcacagccc gcaatttgtt taagggacac 1560
aactacaaag tgacggttcc tcgcaacgcg gtatatgaca tcatgggcaa taatggtaat 1620
gactccatca ccttgcacta catcggcgga ttcgaacgcg aagtaagtta tgacaacgat 1680
acactcttcg tagaggattt caacaacggg ttcactaaga ccatgttgta tgaaggagac 1740
cataataaac ccacggagga aatgcaaaaa atggatttca acgatagcga taactatcct 1800
tggaccatgg tacgcgatga agatgatccc aataactatg ccgctgcttc cacttcgttg 1860
tatgaccccg tcgggaggtc tgacgactgg atggtcacag cgcatatcta tatcccggac 1920
gacagatgtt atttgcaatt tgatgcccaa agtttccggg agaataaaaa agatagcctt 1980
tatgtattgg tttgggtaac tgaagatgag ttttcttcaa tgaacagcga acgtatcgcc 2040
aagttcaaag ccgaatgcga caccatatat gccggcatcg aaactccggg agactatgaa 2100
gacttccttg ccggagactg gacggaacac acggcaagcc tacaacctta tgccggcaaa 2160
aacatctaca tcgcttttgt caacagcaat gaaaaccaaa gttgtgtatt cgtagacaac 2220
cttttggtga gacaagacca acagttccaa atcagcgtga ccacagacca aacggtggta 2280
aaagccaaag agataaaaat cagtggacaa atcagaatca cgacaccaga ccacacttac 2340
tcgaccttgg aactgacctt gaaagattct cagggaaaca tggtagacca aattagcgaa 2400
agcaacctgg aactgaaaga aaatgataag tatgacttct ccttcgacaa gccattaccc 2460
ttggttatcg gccaagaaaa cgaatatact atccttgtcc aattggatga agaacaaagt 2520
gaacctcact attcagtaaa gaacctgtct ttccagaccg ttaaacgggt tgtattggaa 2580
gaaggtaccg gacaggattg tccgaactgt ccgcaaggca ttctggccat tgaaaatctg 2640
gaagatgttt acggcgactt gttcatccct gtaagccttc atacttattc cggtgaccct 2700
tatggaactg gctttgaaag ctatgccgca ttcctgaaca tcacaggata cccttcgggc 2760
accgtcaatc gtaccaatca agtccctgtg ggagccatga cttcggatga gaatggaaat 2820
gccaccttca cctctacggc acatccttgt tggttagact atgtagctgc agaaatggaa 2880
ataagcgccg atgcaaactt cgacatcgca aaggctaagg tagaggcaga cagtacaatt 2940
aacatgactt ataaatacca atatgcccta aacgtgaaag gacaaaacgt aaacctcttt 3000
actgtaatat tagaagatag cctgattggc aagcaaagaa actcttatgg ttccaacagc 3060
gacccgcttt taggcgaatg gggccaggga ggtaaatact ctgccgccac cgtacgtaat 3120
tatcctttcg tagatgtagt gagaggcgta gtcggtgata cattctacgg cactgccggt 3180
tatatccaac cgaatgtcga ggccggtaag gaatatacgg ccgaagccac tttcaagctt 3240
cctgcaaaag tcctgaaagc taaaaactgt aagatcgtaa ccatgatgat agacgccaac 3300
agcggcaaag tcatcaattc ggcgcgcgcc aaattggatg tatcggattt tgctaatgcc 3360
atagctgaca tacaagcaga aaaaccgaat gtcgaaattg cagttgtaaa cggtggcgta 3420
gagattacca cagatgcccc ggcacaaatt aatttgtata gcctgaacgg cactttaatc 3480
ggaacagcca aaagccaagg ctttgtctca ttaagcacaa acggataccg tggccttact 3540
ttggtgaaag tgactaccgg acatcaaact ttagtgaaaa aagtcatcgt gaagtaa 3597
<210> 10
<211> 1260
<212> PRT
<213> Paraprevotella xylaniphila
<400> 10
Met Lys Lys Gly Leu Leu Lys Ile Ala Leu Ser Val Leu Gly Ala Ala
1 5 10 15
Phe Phe Leu Cys Ala Lys Ala Gln Gly Val Pro Gly Ala Ser Ile His
20 25 30
Pro Phe Val Phe Glu Asp Val Ser Ile Leu Thr Asn Ile Ser Asn Asn
35 40 45
Gly Lys Trp Ala Thr Ala His Gly Thr Glu Glu Ser Lys Thr Met Phe
50 55 60
Pro Lys Leu Ile Asp Leu Ser Ser Asn Gln Val Thr Glu Leu Leu Thr
65 70 75 80
Gly Glu Asp Tyr Ser Asn Leu Thr Ser Ala Ala Tyr Asp Val Thr Asp
85 90 95
Asp Gly Ser Met Val Ala Gly Ser Tyr Asp Gly Gln Pro Ala Ile Tyr
100 105 110
Ile Thr Ala Gln Lys Arg Trp Ser Met Leu Ala Val Pro Asp Ser Cys
115 120 125
Val Gly Gly Glu Val Thr Ala Ile Thr Pro Asp Gly His Tyr Ala Val
130 135 140
Gly Arg Gly Leu Tyr Ala Asp Thr Tyr Met Glu Ala Val Val Met Trp
145 150 155 160
Asp Leu Gln Gln Asp Gly Leu Leu Leu Asp Leu Ser Gly Ile Pro Phe
165 170 175
Glu Asp Met Val Gly Glu Asn Asn Asn Gln Arg Arg Leu Ile Gly Val
180 185 190
Thr Pro Asp Gly Asn Lys Val Leu Gly Cys Ile Ser Tyr Ser Tyr Val
195 200 205
Ser Pro Ala Gly Ile Phe Phe Phe Val Tyr Asp Arg Gln Thr Lys Lys
210 215 220
Cys Gln Pro Ile Gly Phe Asp Thr Thr Thr Glu Thr Thr Glu Glu Gly
225 230 235 240
Pro Gln Thr Val Trp Thr Pro Arg Ala Glu Gly Leu Tyr Phe Ile Asn
245 250 255
Ser Ala Asn Leu Ser Ala Asn Gly Arg Tyr Ile Thr Gly Glu Ala Tyr
260 265 270
Met Glu Asn Asp Ala Ala Lys Gly Tyr Val Tyr Asp Ile Glu Thr Asp
275 280 285
Lys Phe Thr Leu Tyr Ser Asp Ala Gln Asp Asp Met Pro Gly Phe Thr
290 295 300
Ala Cys Ser Asn Gly Ile Val Leu Gly Ala Thr Pro Pro Thr Asn Pro
305 310 315 320
Tyr Arg Asp Trp Ser Val Arg Val Gly Glu Tyr Trp Tyr Pro Ile Ser
325 330 335
Leu Ile Leu Ser Gln Arg Tyr Gly Ile Asn Phe Ser Thr Glu Thr Asn
340 345 350
Phe Ser Asn Thr Gly Thr Pro Ile Ala Thr Ser Leu Asp Gly Lys Lys
355 360 365
Ile Val Val Phe Val Ala Pro Gln Lys Glu Asn Tyr Ile Leu Glu Leu
370 375 380
Pro Glu Thr Leu Asp Ala Ala Thr Thr Asp Ile Asp Leu Leu Gly Asn
385 390 395 400
Tyr Ile Val Asn Pro Lys Ala Gly Ser Glu Phe Ser Ser Leu Lys Thr
405 410 415
Val Thr Phe Thr Phe Pro Tyr Glu Ile Glu Leu Leu Gly Asp Lys Ser
420 425 430
Asn Val Thr Phe Thr Asp Glu Asn Gly Lys Lys Thr Gly Thr Ile Leu
435 440 445
Ser Leu Lys Thr Asp Asp Lys Asp Leu Asn Ile Ser Phe Arg Ser Thr
450 455 460
Asn Leu Gln Ala Gly Val Lys Tyr Thr Leu Thr Leu Ser Ala Gly Thr
465 470 475 480
Ile Ala Ile Lys Gly Asp Thr Glu Arg Lys Asn Lys Glu Leu Ser Val
485 490 495
Thr Tyr Thr Gly Arg Asp Asn Val Pro Val Lys Met Val Ser Ala Thr
500 505 510
Pro Tyr Glu Asn Thr Glu Val Lys Gln Leu Asn Tyr Thr Asn Asn Pro
515 520 525
Ile Thr Ile Thr Phe Asp Thr Gln Leu Ser Leu Asn Glu Asp Ala Ile
530 535 540
Gly Tyr Leu Tyr Lys Asp Asp Glu Thr Thr Pro Asp Gly Thr Leu Arg
545 550 555 560
Leu Gly Val Thr Asp Asn Lys Leu Tyr Val Phe Pro Val Thr Ala Arg
565 570 575
Asn Leu Phe Lys Gly His Asn Tyr Lys Val Thr Val Pro Arg Asn Ala
580 585 590
Val Tyr Asp Ile Met Gly Asn Asn Gly Asn Asp Ser Ile Thr Leu His
595 600 605
Tyr Ile Gly Gly Phe Glu Arg Glu Val Ser Tyr Asp Asn Asp Thr Leu
610 615 620
Phe Val Glu Asp Phe Asn Asn Gly Phe Thr Lys Thr Met Leu Tyr Glu
625 630 635 640
Gly Asp His Asn Lys Pro Thr Glu Glu Met Gln Lys Met Asp Phe Asn
645 650 655
Asp Ser Asp Asn Tyr Pro Trp Thr Met Val Arg Asp Glu Asp Asp Pro
660 665 670
Asn Asn Tyr Ala Ala Ala Ser Thr Ser Leu Tyr Asp Pro Val Gly Arg
675 680 685
Ser Asp Asp Trp Met Val Thr Ala His Ile Tyr Ile Pro Asp Asp Arg
690 695 700
Cys Tyr Leu Gln Phe Asp Ala Gln Ser Phe Arg Glu Asn Lys Lys Asp
705 710 715 720
Ser Leu Tyr Val Leu Val Trp Val Thr Glu Asp Glu Phe Ser Ser Met
725 730 735
Asn Ser Glu Arg Ile Ala Lys Phe Lys Ala Glu Cys Asp Thr Ile Tyr
740 745 750
Ala Gly Ile Glu Thr Pro Gly Asp Tyr Glu Asp Phe Leu Ala Gly Asp
755 760 765
Trp Thr Glu His Thr Ala Ser Leu Gln Pro Tyr Ala Gly Lys Asn Ile
770 775 780
Tyr Ile Ala Phe Val Asn Ser Asn Glu Asn Gln Ser Cys Val Phe Val
785 790 795 800
Asp Asn Leu Leu Val Arg Gln Asp Gln Gln Phe Gln Ile Ser Val Thr
805 810 815
Thr Asp Gln Thr Val Val Lys Ala Lys Glu Ile Lys Ile Ser Gly Gln
820 825 830
Ile Arg Ile Thr Thr Pro Asp His Thr Tyr Ser Thr Leu Glu Leu Thr
835 840 845
Leu Lys Asp Ser Gln Gly Asn Met Val Asp Gln Ile Ser Glu Ser Asn
850 855 860
Leu Glu Leu Lys Glu Asn Asp Lys Tyr Asp Phe Ser Phe Asp Lys Pro
865 870 875 880
Leu Pro Leu Val Ile Gly Gln Glu Asn Glu Tyr Thr Ile Leu Val Gln
885 890 895
Leu Asp Glu Glu Gln Ser Glu Pro His Tyr Ser Val Lys Asn Leu Ser
900 905 910
Phe Gln Thr Val Lys Arg Val Val Leu Glu Glu Gly Thr Gly Gln Asp
915 920 925
Cys Pro Asn Cys Pro Gln Gly Ile Leu Ala Ile Glu Asn Leu Glu Asp
930 935 940
Val Tyr Gly Asp Leu Phe Ile Pro Val Ser Leu His Thr Tyr Ser Gly
945 950 955 960
Asp Pro Tyr Gly Thr Gly Phe Glu Ser Tyr Ala Ala Phe Leu Asn Ile
965 970 975
Thr Gly Tyr Pro Ser Gly Thr Val Asn Arg Thr Asn Gln Val Pro Val
980 985 990
Gly Ala Met Thr Ser Asp Glu Asn Gly Asn Ala Thr Phe Thr Ser Thr
995 1000 1005
Ala His Pro Cys Trp Leu Asp Tyr Val Ala Ala Glu Met Glu Ile
1010 1015 1020
Ser Ala Asp Ala Asn Phe Asp Ile Ala Lys Ala Lys Val Glu Ala
1025 1030 1035
Asp Ser Thr Ile Asn Met Thr Tyr Lys Tyr Gln Tyr Ala Leu Asn
1040 1045 1050
Val Lys Gly Gln Asn Val Asn Leu Phe Thr Val Ile Leu Glu Asp
1055 1060 1065
Ser Leu Ile Gly Lys Gln Arg Asn Ser Tyr Gly Ser Asn Ser Asp
1070 1075 1080
Pro Leu Leu Gly Glu Trp Gly Gln Gly Gly Lys Tyr Ser Ala Ala
1085 1090 1095
Thr Val Arg Asn Tyr Pro Phe Val Asp Val Val Arg Gly Val Val
1100 1105 1110
Gly Asp Thr Phe Tyr Gly Thr Ala Gly Tyr Ile Gln Pro Asn Val
1115 1120 1125
Glu Ala Gly Lys Glu Tyr Thr Ala Glu Ala Thr Phe Lys Leu Pro
1130 1135 1140
Ala Lys Val Leu Lys Ala Lys Asn Cys Lys Ile Val Thr Met Met
1145 1150 1155
Ile Asp Ala Asn Ser Gly Lys Val Ile Asn Ser Ala Arg Ala Lys
1160 1165 1170
Leu Asp Val Ser Asp Phe Ala Asn Ala Ile Ala Asp Ile Gln Ala
1175 1180 1185
Glu Lys Pro Asn Val Glu Ile Ala Val Val Asn Gly Gly Val Glu
1190 1195 1200
Ile Thr Thr Asp Ala Pro Ala Gln Ile Asn Leu Tyr Ser Leu Asn
1205 1210 1215
Gly Thr Leu Ile Gly Thr Ala Lys Ser Gln Gly Phe Val Ser Leu
1220 1225 1230
Ser Thr Asn Gly Tyr Arg Gly Leu Thr Leu Val Lys Val Thr Thr
1235 1240 1245
Gly His Gln Thr Leu Val Lys Lys Val Ile Val Lys
1250 1255 1260
<210> 11
<211> 3783
<212> DNA
<213> Paraprevotella xylaniphila
<400> 11
atgaagaaag gtctattaaa aattgccctc tcagtattag gagcggcgtt ttttctttgc 60
gcgaaagcgc aaggggttcc aggagcctcc attcatccgt ttgtatttga agatgtatcc 120
atccttacta acatttcaaa caacgggaaa tgggctaccg ctcatggtac agaggaaagc 180
aaaacgatgt ttcccaaact tatcgactta agctctaacc aagtcacgga actgcttacc 240
ggcgaagact actccaactt gacaagcgcg gcctatgacg tgacagacga cggtagcatg 300
gtggcaggaa gctatgacgg ccaaccggct atctatatca cagcccaaaa acgttggagc 360
atgttggccg tacccgacag ctgtgtaggt ggagaagtca cagccatcac ccccgatggc 420
cattacgccg tgggaagggg cctatacgca gatacttaca tggaagctgt cgtcatgtgg 480
gatttacaac aggacggcct gcttctcgat ttatccggca ttccatttga agacatggta 540
ggcgaaaata acaaccaacg ccgtctgata ggtgtaaccc ctgatggaaa taaagtatta 600
ggctgcatat catacagcta cgtaagccct gccggcatct tcttttttgt atatgaccgc 660
caaaccaaga agtgccagcc tatcggtttc gataccacaa cagaaacaac agaagaaggc 720
cctcaaacgg tttggacacc gcgtgccgaa ggactttatt ttatcaactc agccaatctc 780
agtgctaacg gacgatatat cacaggcgaa gcgtacatgg aaaatgatgc agccaaaggt 840
tacgtttacg acatagaaac ggataaattc accctataca gcgatgcaca agatgacatg 900
cccggtttta cagcatgcag taatggaatc gtgttgggtg caacccctcc gaccaatcct 960
tatcgtgact ggagcgtgcg cgtgggagaa tattggtatc ccatttcttt gattctctct 1020
caacgttatg gcatcaactt ctctacggag actaattttt ctaatacagg aacaccgata 1080
gccactagcc tggatggcaa aaaaatcgtt gtattcgtag ctccacaaaa agaaaattat 1140
attttggaat taccggagac cttggatgca gcaaccacag atattgactt attaggtaac 1200
tacatcgtca atccgaaggc cggaagtgaa ttctcgtctc ttaaaaccgt cacgtttaca 1260
ttcccctatg aaatagaact attaggagac aaaagcaatg tgacctttac ggacgaaaac 1320
gggaaaaaaa cgggtacgat cttaagtctt aaaacggatg acaaggattt gaatatcagc 1380
ttccgcagca caaacctgca agcaggagtg aaatataccc tgaccctttc tgccggtact 1440
atcgccatca aaggcgacac ggaacgtaaa aacaaagagc tctccgtcac ttatacagga 1500
cgtgataatg ttccggtaaa aatggtatct gccactccgt atgaaaacac ggaagtaaaa 1560
cagctgaatt acaccaacaa tcctataacc ataacgttcg acacccaatt atccctcaat 1620
gaggacgcta tcggttacct ttacaaagac gacgaaacga ccccggacgg caccttacgt 1680
ttaggagtca cagacaataa attatatgtg ttccctgtca cagcccgcaa tttgtttaag 1740
ggacacaact acaaagtgac ggttcctcgc aacgcggtat atgacatcat gggcaataat 1800
ggtaatgact ccatcacctt gcactacatc ggcggattcg aacgcgaagt aagttatgac 1860
aacgatacac tcttcgtaga ggatttcaac aacgggttca ctaagaccat gttgtatgaa 1920
ggagaccata ataaacccac ggaggaaatg caaaaaatgg atttcaacga tagcgataac 1980
tatccttgga ccatggtacg cgatgaagat gatcccaata actatgccgc tgcttccact 2040
tcgttgtatg accccgtcgg gaggtctgac gactggatgg tcacagcgca tatctatatc 2100
ccggacgaca gatgttattt gcaatttgat gcccaaagtt tccgggagaa taaaaaagat 2160
agcctttatg tattggtttg ggtaactgaa gatgagtttt cttcaatgaa cagcgaacgt 2220
atcgccaagt tcaaagccga atgcgacacc atatatgccg gcatcgaaac tccgggagac 2280
tatgaagact tccttgccgg agactggacg gaacacacgg caagcctaca accttatgcc 2340
ggcaaaaaca tctacatcgc ttttgtcaac agcaatgaaa accaaagttg tgtattcgta 2400
gacaaccttt tggtgagaca agaccaacag ttccaaatca gcgtgaccac agaccaaacg 2460
gtggtaaaag ccaaagagat aaaaatcagt ggacaaatca gaatcacgac accagaccac 2520
acttactcga ccttggaact aaccttgaaa gattctcagg gaaacatggt agaccaaatt 2580
agcgaaagca acctggaact gaaagaaaat gataagtatg acttctcctt cgacaagcca 2640
ttacccttgg ttatcggcca agaaaacgaa tatactatcc ttgtccaatt ggatgaagaa 2700
caaagtgaac ctcactattc agtaaagaac ctgtctttcc agaccgttaa acgggttgta 2760
ttggaagaag gtaccggaca ggattgtccg aactgtccgc aaggcattct ggccattgaa 2820
aatctggaag atgtttacgg cgacttgttc atccctgtaa gccttcatac ttattccggt 2880
gacccttatg gaactggctt tgaaagctat gccgcattcc tgaacatcac aggataccct 2940
tcgggcaccg tcaatcgtac caatcaagtc cctgtgggag ccatgacttc ggatgagaat 3000
ggaaatgcca ccttcacctc tacggcacat ccttgttggt tagactatgt agctgcagaa 3060
atggaaataa gcgccgatgc aaacttcgac atcgcaaagg ctaaggtaga ggcagacagt 3120
acgattaaca tgacttataa ataccaatat gccctaaacg tgaaaggaca aaacgtaaac 3180
ctctttactg taatattaga agatagcctg attggcaagc aaagaaactc ttatggttcc 3240
aacagcgacc cacttttagg cgaatggggc cagggaggta aatactctgc cgccaccgta 3300
cgtaattatc ctttcgtaga tgtagtgaga ggcgtagtcg gtgatacatt ctacggcact 3360
gccggttata tccaaccgaa tgtcgaggcc ggtaaggaat atacggccga agccactttc 3420
aagcttcctg caaaagtcct gaaagctaaa aactgtaaga tcgtaaccat gatgatagac 3480
gccaacagcg gcaaagtcat caattcggcg cgcgccaaat tggatgtatc ggattttgct 3540
aatgccatag ctgacataca agcagaaaaa ccgaatgtcg aaattgcagt tgtaaacggt 3600
ggcgtagaga ttaccacaga tgccccggca caaattaatt tgtatagcct gaacggcact 3660
ttaatcggaa cagccaaaag ccaaggcttt gtctcattaa gcacaaacgg ataccgtggc 3720
cttactttgg tgaaagtgac taccggacat caaactttag tgaaaaaagt catcgtgaag 3780
taa 3783
<210> 12
<211> 1242
<212> PRT
<213> Prevotella rara
<400> 12
Met Gly Leu Leu Trp Ala Ile Pro Ser Ser Ala Gln Glu Ala Gln Leu
1 5 10 15
Thr Thr Val Ser Phe Pro Lys Ala Ala Ala Phe Thr Ser Leu Ser Asp
20 25 30
Asn Gly Leu Trp Ala Thr Ala Ala Gly Val Asn Asp Asp Asp Gln Ser
35 40 45
Lys Tyr Ala Tyr Pro Tyr Leu Ile Asn Val Glu Thr Gly Ala Leu Thr
50 55 60
Glu Leu Trp Val Glu Ala Asp Leu Ile Lys Ser Leu Glu Ala Thr Asp
65 70 75 80
Val Thr Asn Asp Gly Lys Ile Ile Val Gly Thr Tyr Asp Ser Lys Pro
85 90 95
Ala Tyr Tyr Asp Met Asn Gln Gly Lys Trp Ile Thr Leu Gln Ser Glu
100 105 110
Asn Pro Gly Lys Ala Thr Ser Val Thr Pro Asp Gly Lys Tyr Ile Ser
115 120 125
Gly Trp Ser Asn Ser Gly Ser Phe Ser Gly Asp Ala Tyr Val Glu Thr
130 135 140
Pro Leu Leu Trp Glu Lys Gln Ser Asp Gly Thr Tyr Arg Ala Ile Asp
145 150 155 160
Val Tyr Ala Glu Leu Pro Asn Phe Pro Lys Lys Thr Lys Leu Gly Thr
165 170 175
Asn Thr Gln Gln Val Arg Ile Asp Asn Val Ser Pro Asp Gly Asn Ile
180 185 190
Leu Ser Gly Ile Ile Asn Phe Val Thr Pro Ala Thr Val Cys Tyr Tyr
195 200 205
Val Tyr Asn Lys Thr Thr Gln Glu Cys Lys Tyr Val Asp Asn Ala Leu
210 215 220
Gly Glu Val Pro Asp Glu Thr Phe Val Asp Glu Ser Thr Met Ser Asn
225 230 235 240
Asn Gly Lys Tyr Leu Thr Gly Ile Val Gln Val Ala Gly Gly Ser Tyr
245 250 255
Ile Ser Ser Tyr Leu Tyr Asn Thr Ser Asp Asn Ser Cys Ile Leu Tyr
260 265 270
Asn Thr Glu Ser Glu Glu Gln Asp Arg Ala Gly Ser Ala Val Ser Asn
275 280 285
Thr Gly Val Val Phe Ala Cys Ser Pro Ala Val Asn Pro Val Arg Ser
290 295 300
Ala Tyr Val Arg Val Gly Ser Leu Trp Tyr Gly Ile Asp Glu Ile Leu
305 310 315 320
Ser Gly Arg Tyr Asp Met Asn Phe Tyr Glu Arg Thr Gly Tyr Asp Phe
325 330 335
Thr Gly Thr Ile Gln Gly Val Ser Asp Asp Glu Lys Thr Ile Ile Gly
340 345 350
Met Ser Glu Thr Lys Thr Lys Gly Tyr Ile Ile Arg Leu Pro Glu Thr
355 360 365
Ile Ser Glu Ala Ala Ser Ser Val Asn Pro Leu Gly Thr Tyr Ala Val
370 375 380
Ser Pro Ala Tyr Gly Ser Lys Phe Ala Lys Phe Ser Gln Met Lys Leu
385 390 395 400
Ala Phe Ser Lys Ser Ala Ala Val Thr Ser Gly Val Lys Ala Gln Phe
405 410 415
Met Asp Ala Ser Gly Lys Val Leu Arg Glu Tyr Asn Ile Thr Ala Gln
420 425 430
Ser Gly Asn Lys Thr Phe Thr Ile Gly Gly Ile Pro Gln Ala Leu Asn
435 440 445
Ala Gly Glu Glu Tyr Thr Met Lys Ile Pro Ala Gly Ala Phe Tyr Leu
450 455 460
Met Ala Asp Asn Ser Ile Lys Ser Asp Glu Ile Thr Ile Lys Tyr Ile
465 470 475 480
Gly Arg Ala Glu Ala Pro Ala Ala Val Gln Gln Val Ser Pro Ala Asp
485 490 495
Gln Ala Asn Val Ser Glu Ile Ser Ser Glu His Pro Val Gln Ile Leu
500 505 510
Phe Asp Leu Ser Ile Ala Ile Ser Glu Asp Ala Lys Ala Tyr Leu Tyr
515 520 525
Arg Asp Gly Gln Thr Ser Pro Val Cys Glu Leu Asn Phe Val Gln Gly
530 535 540
Ala Glu Ser Thr Ile Leu Leu Tyr Pro Ser Leu Lys Arg Tyr Leu Glu
545 550 555 560
Lys Asp Glu Lys Tyr Lys Val Val Val Glu Ala Gly Ser Val Thr Asp
565 570 575
Ile Met Gly Phe Cys Pro Asn Asn Glu Ile Thr Ile Asn Tyr Thr Gly
580 585 590
Ala Tyr Glu Pro Glu Ile Ser Ser Asp Gly Ser Leu Phe Ser Asp Asp
595 600 605
Phe Asn Asp Pro Ser Asn Ser Met Val Lys Tyr Leu Leu Tyr Glu Gly
610 615 620
Asp His Asn Lys Pro Ser Ser Ala Met Gln Asp Leu Gly Phe Asp Ala
625 630 635 640
Asp Asn Ser Pro Trp Leu Phe Val Ile Arg Glu Ser Glu Ser Ser Ser
645 650 655
Asp Tyr Cys Ala Ala Ser Thr Ser Ile Tyr Asn Pro Thr Gly Lys Ser
660 665 670
Asp Asp Trp Met Ala Ile Pro Arg Leu Ser Ile Glu Asn Ala Asp Tyr
675 680 685
Tyr Leu Ser Phe Asp Val Gln Ser Tyr Tyr Lys Ser Lys Ala Asp Arg
690 695 700
Leu Lys Val Leu Val Leu Glu Asp Asp Ala Val Tyr Ser Asn Phe Thr
705 710 715 720
Thr Glu Leu Tyr Glu Lys Phe Lys Ala Asn Gly Lys Val Leu Tyr Asp
725 730 735
Glu Gln Leu Ser Pro Gly Glu Ser Glu Glu Asn Leu Thr Gly Asp Trp
740 745 750
Thr His Val Glu Lys Ser Leu Ala Glu Tyr Ala Gly Lys Asn Ile Tyr
755 760 765
Ile Ala Phe Val Asn Glu Asn Glu Asn Gln Ser Met Ile Phe Leu Asp
770 775 780
Asn Leu Lys Val Tyr Tyr Lys Gly Asp Phe Asn Phe Val Pro Asn Val
785 790 795 800
Glu Ser Thr Gln Val Asn Lys Glu Ser Thr Asn Val Gly Val Ile Val
805 810 815
Lys Val Thr Ser Asp Lys Thr Tyr Asn Thr Ile Asn Ala Thr Leu Thr
820 825 830
Ser Glu Asp Gly Ser Phe Lys Ser Thr Tyr Thr Asp Thr Pro Ser Lys
835 840 845
Pro Ile Thr Ser Ala Glu Asn Tyr Ser Phe Thr Phe Pro Asp Lys Leu
850 855 860
Pro Leu Thr Val Gly Ala Gln Asn Lys Tyr Thr Ile Ser Ile Asp Leu
865 870 875 880
Gly Gly Thr Val Tyr Ser Gln Thr Asn Thr Ile Ser Asn Leu Ala Phe
885 890 895
Glu Thr Thr Lys His Val Val Ile Glu Glu Ser Thr Gly Gln Gln Cys
900 905 910
Lys Asn Cys Pro Gln Gly Ile Leu Ala Met Glu Asn Leu Glu Lys Leu
915 920 925
Tyr Gly Glu Gln Val Ile Pro Ile Ala Ile His Ser Ser Ile Ile Gly
930 935 940
Thr Asp Gln Phe Ala Tyr Glu Asn Tyr Asn Thr Tyr Phe Gly Ile Thr
945 950 955 960
Ala Gln Pro Met Gly Leu Val Asn Arg Ile Asp Thr Leu Tyr Ala Pro
965 970 975
Met Tyr Val Asp Gly Ser Glu Gln Tyr His Phe Asp Ser Pro Glu Gly
980 985 990
Asn Asn Thr Phe Tyr Asp Val Ala Gln Arg Glu Phe Glu Asn Tyr Ala
995 1000 1005
Ile Ala Asp Val Ser Ile Asp Lys Ala Ile Tyr Asp Ala Ser Ser
1010 1015 1020
Lys Asn Ile Gln Ile Thr Gly Asn Val Asn Tyr Ala Leu Thr Met
1025 1030 1035
Asn Ser Leu Asn His Asn Leu Ala Phe Val Val Leu Glu Asp Gly
1040 1045 1050
Leu Glu Gly Pro Gln Leu Asn Gly Phe Tyr Val Leu Asp Gln Pro
1055 1060 1065
Ile Phe Gly Glu Phe Gly Lys Gly Gly Lys Tyr Gly Ser Ser Val
1070 1075 1080
Ala Val Val Thr Phe Asn Asp Val Ala Arg Lys Val Pro Asn Asp
1085 1090 1095
Asn Phe Ala Gly Glu Ser Gly Phe Ile Pro Val Ser Val Thr Ala
1100 1105 1110
Gly Gln Pro Val Ala Phe Asn Lys Val Phe Ser Leu Pro Glu Asn
1115 1120 1125
Val Lys Asn Trp Asp Asn Thr Lys Val Val Val Met Leu Leu Asp
1130 1135 1140
Ala Asn Thr Glu Arg Val Leu Asn Ala Ala Arg Leu Lys Met Ser
1145 1150 1155
Ala Gly Thr Ala Gly Ile Asn Asp Ala Thr Val Ser Glu Asn Gly
1160 1165 1170
Ile Thr Ile Ser Gly Glu Asn Gly Ser Ile Ser Val Asn Gly Gly
1175 1180 1185
Ser Asp Leu Asn Val Thr Val Tyr Asp Val Ser Gly Ser Val Ile
1190 1195 1200
Ser Asn Val Asn Ser Thr Ser Gly Ala Val Lys Val Ser Thr Gly
1205 1210 1215
Gly Lys Asn Gly Leu Phe Ile Val Lys Ala Thr Ser Asp Gly Val
1220 1225 1230
Ser Val Val Lys Lys Val Ile Val Lys
1235 1240
<210> 13
<211> 3729
<212> DNA
<213> Prevotella rara
<400> 13
ttgggcctgc tgtgggcaat tccgtcttca gcacaggaag cacagcttac cactgtgagt 60
ttccccaagg ctgcagcttt cacatctttg tcagacaatg gcttgtgggc tactgctgct 120
ggtgtgaatg atgatgatca aagtaagtat gcctatcctt atcttattaa tgtagagaca 180
ggtgctctta ctgaactttg ggttgaagca gacttgataa agtcactcga agcaacagac 240
gttactaatg atggaaaaat catcgttggt acctatgaca gcaagcctgc ttattatgac 300
atgaatcagg gtaagtggat cactcttcag tctgagaatc ctggaaaagc tacatctgta 360
acacctgatg gcaaatacat cagcggatgg agcaattctg gtagcttctc aggtgatgct 420
tatgttgaaa caccgctttt gtgggagaag cagtctgatg gtacatatcg tgcaattgat 480
gtgtatgctg aacttcctaa cttcccaaag aaaacaaagc ttggaacaaa cactcagcag 540
gtacgtatag acaatgttag tcctgatggt aacatccttt ctggtattat taactttgta 600
actcctgcaa ctgtttgcta ttatgtttac aacaagacaa ctcaggagtg caaatatgta 660
gacaatgctc tcggagaagt gcctgacgaa acatttgttg acgagtcaac aatgagtaac 720
aatggtaaat atcttaccgg tatcgtacag gtggctggtg gttcatacat ttcaagctat 780
ctttacaata caagtgacaa ctcttgtata ttatataata cagaaagcga agagcaagac 840
cgtgccggaa gtgctgtttc aaatacaggc gttgtatttg cttgcagtcc tgccgtaaac 900
cctgtccgtt ctgcttacgt acgtgtgggt agtctttggt acggaataga cgaaattctt 960
tctggccgtt atgatatgaa cttctatgag cgtactggct atgatttcac aggaacaatt 1020
caaggagttt ctgatgatga aaagaccata atcggtatgt ctgaaacaaa gacaaagggt 1080
tatatcattc gtcttcctga aactatcagc gaagctgcaa gcagcgtgaa tccattaggc 1140
acttatgccg tttctcctgc ttatggctct aagtttgcta aattcagtca gatgaagttg 1200
gctttctcta agtctgccgc tgtaacatct ggtgttaagg ctcagttcat ggacgcttct 1260
ggtaaggtgt tgagagagta taacattacg gctcagtctg gtaacaagac attcactatt 1320
ggcggtatac ctcaggctct gaatgcaggc gaggaatata caatgaagat tcctgctgga 1380
gcattctatt tgatggcaga caactctatc aagagtgatg aaatcacaat taagtatatc 1440
ggccgtgccg aggctccggc tgctgtacag caggtttctc ctgcagatca ggcaaatgta 1500
tctgaaataa gttcagagca tcctgtacag attctcttcg atttgtctat cgctatatct 1560
gaagatgcta aggcttatct gtatagagat ggtcagacat caccggtttg tgaacttaac 1620
tttgtgcagg gagcagaaag cacaatcctt ctgtatccta gcctcaagcg ttatcttgaa 1680
aaagatgaga aatataaggt tgttgtagag gctggttctg ttacagatat tatgggattc 1740
tgtccgaaca atgagattac aattaattac acaggtgcat acgagcctga aatcagttct 1800
gacggaagcc tcttctctga tgacttcaat gatcctagca actctatggt taaatacctc 1860
ctttatgaag gtgaccacaa caagccgtca agtgcaatgc aggatttagg ctttgatgcc 1920
gacaattctc cttggctttt cgtaatccgt gaatcagaaa gcagctctga ctattgtgca 1980
gcttcaacat ctatctataa tcctacagga aaatctgacg actggatggc aataccccgt 2040
ctgtcaatag aaaatgccga ctattatctc agctttgacg ttcagtcata ctataagtca 2100
aaggcagaca gacttaaggt tctcgttctt gaggatgatg ctgtttattc taacttcaca 2160
acagaacttt atgagaagtt caaggctaat ggtaaggttc tttatgacga gcagctttct 2220
cccggtgagt cagaggaaaa tctgacaggt gactggacac atgttgagaa atctctcgca 2280
gaatatgcag gtaagaatat ctatatcgct ttcgttaatg agaacgaaaa tcagagtatg 2340
atcttcttgg ataatctgaa ggtttactac aagggcgact tcaacttcgt tcctaatgta 2400
gaaagcacgc aggttaacaa agagtctaca aatgttggtg taatagttaa ggttacaagt 2460
gataagactt ataacactat taatgcaaca cttacttctg aagacggctc attcaagagc 2520
acttatacag atacacctag caagcctata acaagtgcag agaattattc tttcacattc 2580
cctgacaagt tgcctctgac tgtaggtgca cagaataagt atactataag cattgatctc 2640
ggcggtactg tttacagcca gacaaacact atcagcaacc ttgcttttga gacaacaaag 2700
catgtagtta tagaagaaag cactggtcag cagtgtaaga actgtcctca gggaatcctt 2760
gcaatggaaa atcttgagaa actttacggt gagcaggtta tacctatcgc tattcatagc 2820
tctataatcg gaacagacca gtttgcttat gagaactata atacatattt cggtataaca 2880
gcacagccta tgggtctggt taatcgtata gacactcttt atgcgccgat gtatgttgat 2940
ggaagcgagc agtatcactt cgattcacca gaaggcaaca atacattcta tgatgtagct 3000
cagcgtgagt tcgaaaacta tgcaattgca gatgtcagca tcgacaaggc tatttacgat 3060
gcttcaagca agaacataca gattacaggt aatgtaaact atgctctaac catgaactct 3120
ctgaaccaca atcttgcatt tgttgttctt gaggatggtt tggaaggacc tcagcttaat 3180
ggtttctacg ttctcgacca gccgatattc ggtgagttcg gtaagggcgg caagtatggt 3240
tcaagcgttg ctgttgtaac attcaacgat gttgcacgta aagtgccaaa cgacaacttc 3300
gcaggtgaat caggtttcat ccctgtaagc gtaactgcag gtcagcctgt tgcattcaat 3360
aaggtattca gcttgcctga aaacgttaag aactgggata acactaaggt tgttgttatg 3420
ttgctcgatg caaatacaga gcgcgttctg aatgctgcaa gacttaagat gtcagccgga 3480
acagctggaa tcaacgatgc aacagtatct gaaaacggta tcacaatttc tggtgagaac 3540
ggttcaataa gcgttaacgg cggaagcgat ctcaacgtta cagtttatga tgtaagcgga 3600
agcgtaataa gcaatgtcaa ctctacttct ggtgctgtta aggtaagcac aggcggaaag 3660
aacggcttgt ttatcgttaa ggctacaagc gacggcgtaa gcgttgttaa gaaagtaatc 3720
gttaaataa 3729
<210> 14
<211> 1230
<212> PRT
<213> Prevotella rodentium
<400> 14
Met Arg Lys Asn Lys Leu Lys Thr Phe Leu Leu Ala Val Ile Gly Ile
1 5 10 15
Ser Ser Phe Ala Cys Val Thr Leu His Ala Gln Asn Val Thr Glu Pro
20 25 30
Lys Met Thr Leu Tyr Asp Phe Lys Asp Gln Ser Met Ile Tyr Ser Leu
35 40 45
Ser Asp Asn Gly Gln Trp Ala Val Ser Tyr Gly Thr Ser Pro Thr Asp
50 55 60
Ala Ser Arg Tyr Thr Asn Ala Arg Arg Thr Asn Val Lys Thr Lys Glu
65 70 75 80
Ser Asp Ile Leu Gly Leu Asp Gly Asp Glu Thr Ile Pro Leu Gln Cys
85 90 95
Gln Ala Asn Asp Val Ala Asp Asp Gly Thr Val Val Gly Ala Tyr His
100 105 110
Asp Gln Pro Ala Ile Trp Thr Lys Ala Gly Gly Trp Lys Tyr Leu Pro
115 120 125
Ile Pro Lys Gly Trp Thr Thr Gly Phe Ala Ser Ala Val Thr Pro Asp
130 135 140
Gly His Tyr Ala Val Gly Arg Met Phe Ser Tyr Ser Gly Asn Ala Glu
145 150 155 160
Asn Tyr Gly Glu Tyr Pro Met Leu Trp Asp Leu Thr Thr Met Gln Ile
165 170 175
Thr Glu Thr Pro Gly Tyr Pro Thr Val Gly Ser Ala Gly Glu Lys Ala
180 185 190
Arg Met Ile Arg Tyr Asp Ala Ile Ser Ser Asp Ser Arg Tyr Ile Thr
195 200 205
Gly Ile Val Asp Phe Ser Tyr Thr Trp Asn Thr Leu His Phe Ile Tyr
210 215 220
Asp Arg Gln Asn Glu Ser Tyr Thr Thr Met Gly Phe Asn Thr Asp Gly
225 230 235 240
Thr Pro Trp Thr Glu Gly Leu Leu Gly Val Glu Gly Thr Leu Ser Pro
245 250 255
Asn Gly Lys Trp Phe Gly Gly Thr Ala Phe Ile Gln Asn Ala Thr Asn
260 265 270
Pro Asn Asp Glu Tyr Ser Val Pro Cys Arg Tyr Asn Met Glu Thr Asn
275 280 285
Glu Phe Glu Met Phe Asn Glu Leu Glu Ala Arg Asp Tyr Gly Ser Ile
290 295 300
Thr Ile Asp Asn Thr Gly Thr Ile Tyr Ser Ala Thr Pro Ser Ser Thr
305 310 315 320
Pro Ile Arg Ser Val Tyr Ile Arg Ala Gly Lys Phe Trp Tyr Ala Leu
325 330 335
Asp Glu Leu Met Ser Gln Ser Tyr Gly Ile Asp Phe Tyr Gly Lys Thr
340 345 350
Gly Leu Asp Asn Thr Gly Thr Ile Met Ser Val Ser Ala Asp Gly Lys
355 360 365
Val Ile Thr Ala Phe Pro Asp Pro Tyr Lys Ser Tyr Ile Leu Glu Leu
370 375 380
Asp Glu Thr Leu Thr Asp Ala Ala Gly Arg Val Asn Leu Leu Asn Asn
385 390 395 400
Tyr Thr Val Thr Pro Ala Asn Gly Ala Ser Phe Ser Gln Met Lys Asp
405 410 415
Val Ser Ile Lys Phe Ser Arg Asp Ile Lys Ile Leu Gly Lys Thr Ser
420 425 430
Asp Ile Lys Phe Thr Asp Asp Ser Gly Ala Ser Val Gly Arg Ile Ile
435 440 445
Thr Phe Ala Val Ser Pro Ser Ser Ser Lys Thr Ala Arg Ile Ala Phe
450 455 460
Arg Thr Leu Asn Leu Thr Ala Gly Lys Lys Tyr Thr Leu Thr Ile Pro
465 470 475 480
Ala Gly Thr Ile Ala Leu Ser Ala Asp Glu Thr Arg Leu Asn Asp Glu
485 490 495
Ile Val Ile Thr Tyr Thr Gly Arg Gly Thr Glu Pro Val Lys Val Val
500 505 510
Thr Ala Val Pro Glu Ser Gly Ser Ala Leu Ser Gln Leu Asn Val Thr
515 520 525
Thr Asn Pro Ile Leu Leu Thr Phe Asp Thr Asn Ile Ser Leu Thr Glu
530 535 540
Asn Ala Ala Ala Thr Leu Tyr Arg Asp Gly Ser Asp Asp Ile Val Ser
545 550 555 560
Pro Leu Ser Val Ala Val Lys Asp Asn Gln Met Leu Ile Tyr Pro Glu
565 570 575
Thr Thr Gln Tyr Leu Tyr Leu Asn Thr Asn Tyr Lys Val Val Leu Asn
580 585 590
Ala Gly Ser Val Thr Asp Val Asn Gly Gly Asn Ala Asn Glu Arg Tyr
595 600 605
Glu Ile Met Tyr Glu Gly Ile Tyr Glu Arg Ile Val Val Ala Asp Asp
610 615 620
Thr Leu Ile Tyr Lys Glu Asp Phe Ala Asn Gly Val Gly Gly Met Met
625 630 635 640
Leu Tyr Asp Gly Asp Gly Asn Ile Pro Asn Glu Glu Met Lys Asp Tyr
645 650 655
Asp Phe Tyr Tyr Asn Asn Thr Pro Gln Pro Trp Val Pro Val Arg Glu
660 665 670
Ser Arg Glu Ser Asp Asp Tyr Ser Ala Ala Ser Thr Ser Ala Tyr Ser
675 680 685
Pro Ala Gly Lys Ser Asp Asp Trp Met Val Thr Pro Gln Ile Tyr Ile
690 695 700
Pro Asp Ala Lys Cys Arg Leu Glu Phe Asn Gly Gln Gly Phe Arg Lys
705 710 715 720
Tyr Lys Gln Asp Lys Leu Lys Val Ile Val Tyr Ala Ser Asp Lys Val
725 730 735
Leu Asn Tyr Phe Ser Lys Asp Asn Ala Asp Glu Phe Arg Ala Asp Gly
740 745 750
Asn Val Ile Met Asp Glu Ile Leu Ser Pro Gly Asn Ser Glu Asp Asn
755 760 765
Leu Ser Asn Glu Trp Thr Thr Tyr Ser Phe Lys Leu Asp Lys Tyr Ala
770 775 780
Gly Lys Asn Ile Tyr Val Ala Phe Ile Asn Glu Asn Glu Asp Gln Ser
785 790 795 800
Ile Val Phe Val Asp Asn Ile Lys Val Val Arg Asp Asn Gly Phe Leu
805 810 815
Thr Ala Leu Thr Ser Ala Thr Thr Val Val Gly Gln Asn Ser His Lys
820 825 830
Ile Glu Gly Arg Val Thr Ala Asn Ser Glu Thr Glu Thr Tyr Thr Thr
835 840 845
Ala Asn Ile Lys Leu Leu Asp Ser Glu Lys Asn Val Val Asp Glu Ile
850 855 860
Ser Glu Asn Gly Leu Ser Leu Lys Lys Gly Asp Arg Tyr Asp Phe Ala
865 870 875 880
Phe Ala Lys Asn Leu Pro Leu Thr Ile Gly Glu Ile Asn Val Phe Tyr
885 890 895
Ile Arg Val Gln Leu Asp Glu Lys Phe Asp Thr Ile Ser Tyr Ala Ile
900 905 910
Lys Asp Leu Ala Phe Gln Pro Thr Lys Arg Val Ile Val Glu Glu Met
915 920 925
Thr Gly Gln Asp Cys Gly Asn Cys Pro Arg Gly His Leu Ala Trp Glu
930 935 940
Asn Leu Glu Arg Val Tyr Gly Asp Arg Val Ile Leu Ala Gly Tyr His
945 950 955 960
Val Tyr Thr Gly Asp Ile Tyr Glu Ser Gly Met Ser Ser Tyr Val Asn
965 970 975
Gln Phe Leu Gly Leu Ala Gly Ala Pro Ser Ala Lys Val Gln Arg Gly
980 985 990
Glu Thr Ile Gly Ser Pro Thr Tyr Ala Ser Ile Thr Ala Gly Arg Thr
995 1000 1005
Ser Tyr Ser Phe Thr Ser Pro Leu Gly Asp Cys Trp Phe Asp Leu
1010 1015 1020
Val Gln Lys Glu Phe Asp Thr Asp Ala Asp Ala Asn Leu Asp Val
1025 1030 1035
Ile Ala Tyr Phe Asp Glu Ala Thr Gln Lys Val Lys Ala Thr Ala
1040 1045 1050
Ser Ala Lys Phe Ala Met Asn Ile Ser Lys Gln Asn Ile Gly Leu
1055 1060 1065
Phe Ile Ile Val Thr Glu Asp Gly Leu Pro Gly Phe Gln His Asn
1070 1075 1080
Tyr His Tyr Asn Asp Asp Ala Asp Gly Leu Gly Glu Trp Gly Lys
1085 1090 1095
Gly Gly Thr Leu Gly Gln Glu Tyr Val Val Tyr Thr His Asn Asp
1100 1105 1110
Val Ala Arg Ala Gln Val Gly Ser Tyr Tyr Gly Thr Thr Gly Tyr
1115 1120 1125
Ile Pro Ser Thr Ile Ala Ser Asn Glu Thr Tyr Thr Ala Asn Ile
1130 1135 1140
Glu Phe Ala Lys Pro Ala Val Asn Glu Ile Tyr Asn Ser Asn Val
1145 1150 1155
Ile Cys Met Met Ile Asn Ala Asn Thr Gly Ala Val Ile Asn Val
1160 1165 1170
Ala Lys Ser Lys Ile Ala Lys Ala Ser Gly Ile Glu Gly Ile Thr
1175 1180 1185
Thr Ser Gly Thr Asp Ala Thr Glu Lys Ile Arg Tyr Asn Ala Ala
1190 1195 1200
Gly Gln Ile Ile Thr Ala Pro Val Lys Gly Leu Asn Ile Ile Arg
1205 1210 1215
Met Gly Asp Gly Ser Ile Arg Lys Val Val Val Lys
1220 1225 1230
<210> 15
<211> 3084
<212> DNA
<213> Prevotella rodentium
<400> 15
atgttttcat ttatgggcat cggacaggcc atagccgatg ttgtgggttc gggatcaaaa 60
gaagatccgt acgtgcttga aaacggcaca acactgacat tgaaagcata ccagtctttc 120
tatgccaaat tcacggcccc ggctgacggg atcttctcat tatatacaaa agacagctat 180
gccctatata ccgacgagag tttcagtcag attgacgagt cggcaaacat tacattcaac 240
ggaagctaca gtgagacggc atactacttt aactgtaaag caggtgttac ctattatata 300
ggaaacagtt ttgtaatggt cggaggtacg gttacagcca aattcaacac agaggctgaa 360
ccattggttc tgcgtgaaat atcacctgag gccggttctg ttttcaatgc aggcaaagga 420
agcgtagaac tgacattcaa ccaaaaagtc caaatagctt ctgttactgt caatgctggc 480
acatcctcaa agacaatgtc agccaatgta aatggagtat atgtatctgt agacgtaaaa 540
gactggctga accaaatgta tgatgaaggt aaggtcaaag aaggaaatga gatcagcttc 600
aagttcaaag gcgtggcccc aaccatttct ccaagcaaat tatacaacgg caatggagag 660
ttggacatta catacacagc aggcaagaaa cccctgcaac ttgtaagcag taccaacacc 720
ccaatgagca ctcctgccgt aaccacattc aagtcattct atacatccga tgacaaccaa 780
ggcatcgtaa cactggaatt cagtgacgaa gtgaactttt cggaaggcaa caaaccaacc 840
gcaaccctta catatggaaa tatggaccaa gatgccgacg gcgaatacta cacagaatca 900
ctccccatac ttccacttgg aagcaacatt ttaatggtta atctcaagga caagctgcgc 960
cgtgcacagg acatggtgac ttccggcact gtttacgacg gaataacact atccatatca 1020
cacgtaaagg atatggacgg aaactatgcc tatggctcgg gctccggtgt actcggatcg 1080
ttcgcattca actacaaact atctgaaata aactataccg ttgatacaga ctgggcacta 1140
cttgacgcaa ccggcaatcc gacaaacaaa gacgtcatag attcagacac caaaagcata 1200
gaactatgga tgagcgaaaa cgacggtcag ataacattca acggagtaga attcaaatat 1260
accgaaagcg gtactgaaaa ggtcaagaca atgaccctca gcgagattaa agttgacaac 1320
aatggaaacg aaacaacaat aacaatcccc gttcccaaca tcgcagccga tgccggtagc 1380
gacatcgcca tatctctcaa agacgtggaa cgccccgacg gtattaccac cagtatcgat 1440
ccggccgcac taagctattt cacaaaaaca ttcacaacaa caggcataac tgaaagcaaa 1500
ttcgacatta caagtgccat atggatgtat gaagagaatg aaggcaatat tgtcgaagtc 1560
aatatgataa acggtaatat cggtgtactg acaagaggca gcaaatctgt aatcaaaacc 1620
aacaaagaca atgagattgg ttatgtagaa tgggaaatca gaggcgttga caatcctgat 1680
atggaatata tcagagctgg atatgtagat acaaccatga caggcaagtt ggtagacggc 1740
tttaccatag aatggagagg agaaggcttg acagccggca aagactacac tttcacgctt 1800
caggcatgga agaatgaagc cgacaaaaac agtggtgccg agcctaatgt aggtgaagca 1860
atgttcatca tacacggtac aaaacaagcc tacatttaca gcgatgttgt gatgaagact 1920
gatataagcc agcctatcag acttgcttct gccaatgaca attaccgtac catcgagttc 1980
agcgctcctg taacgttaaa tgccgtggta aaccttggca tggggacatc ggcagattgc 2040
accgttgaac cctcggccga ccgtactgca tggacagtca caatacctga atatgtaatg 2100
tcacaatatg gagagttcag cgtgaatgtc tttgcaaaag atgacgaagg gcgtgcagta 2160
aacaagactg agaacggact gggtataata atggggacag aagacaatac gtggttccag 2220
attgattttg tcagtgagag ccttgccccg gactttacag taacaccggc aaacgaatca 2280
gtgctcgaga gtctcagcac cgtcactttc ggttatgaag gcagtatcag catcaactgg 2340
aacaacagtg agaaaatcac catatataac agaacaacac gtgagaagat agcagagttc 2400
agcggtgatg acgtcgtgtt ggatgaagat ccggatgact attgggctcc aatactctca 2460
tgccacatca cacttcccga gcctgtcact gctattggag tctacgatgt aaatgttcct 2520
gccggtttct tcgttcttgg tgagcagttc gacagtaact taagtaaggc aacaagcata 2580
gtatacgaaa taaaggaacc ggttgagccg ttcggaatag aaataagccc tactgccgga 2640
ttggtaagcg agattccgtc aaaactcata gtaacggtaa cggacagaag cagttgtaac 2700
tttactgcca accccacatt gaccgacaat accggtaaca gctatcccgt acacaacgat 2760
ttcgactggg gtatcccgga aatgaacaag ttcgtcatca ttcttgacaa cggagccatc 2820
acggctgacg gaatatacac actcactatt cctgccggct cgattatagg tgacgacgag 2880
acagacctca acaaggaaga ctttgtattc atatatacaa tcaatctggc cggaatcaca 2940
gaactcgtca acaatgaggg cggaaaggtt gacgtataca ctctcaaagg aacattgttg 3000
atgaaagacg ccgatgcatc agcagtaaac aagctggcaa aaggcatgta cataatcaat 3060
ggtaaaaaag tgttcataag ataa 3084
<210> 16
<211> 1212
<212> PRT
<213> Prevotella muris
<400> 16
Met Lys Lys Thr Ile Ala Leu Phe Val Trp Ser Phe Ile Phe Ile Leu
1 5 10 15
Gly Leu Ser Ala Gln Ser Glu Ser Pro Val Leu Lys Thr Leu Glu Ile
20 25 30
Pro Phe Gly Thr Ile Cys Gly Met Ser Asp Asn Gly Arg Trp Ala Val
35 40 45
Tyr Asp Asp Gly Asn Asp Gly Val Glu Arg Gly Cys Ser Val Asp Val
50 55 60
Tyr Asp Leu Ala Thr Gly Thr Ile Val Thr Leu Ser Glu Gly Thr Glu
65 70 75 80
Gly Asp Asn Ile His Leu Asn Asp Ile Ser Asp Asp Gly Lys Ile Val
85 90 95
Ala Gly Ser Tyr Asn Asn Val Pro Ala Tyr Phe Lys Asp Gly Lys Trp
100 105 110
Thr Val Leu Pro Leu Pro Ala Gly Thr Arg Gly Tyr Gly Gly Arg Val
115 120 125
Ser Ser Met Thr Pro Asp Gly Ser Val Met Val Gly Met Ile Tyr Asn
130 135 140
Asn Ser Phe Asn Phe Thr Ser Cys Tyr Trp Lys Asp Gly Asn Leu Ile
145 150 155 160
Thr Leu Glu Gly Leu Pro Thr Thr Asp Ile Ser Gly Lys Glu Asn Ala
165 170 175
Glu Leu Asn Ile Val Ala Val Ser Ala Asp Gly Asn Ile Val Leu Gly
180 185 190
Gly Leu Ser Thr Asn His Pro Gly Trp Gly Cys Cys Tyr Phe Val Tyr
195 200 205
Asp Val Arg Thr Lys Thr Tyr Glu Met Leu Gly Gln Asn Leu Thr Ser
210 215 220
Glu Ser Phe Ile Asp Asn Val Val Met Ser Asn Asn Gly Ser Phe Val
225 230 235 240
Gly Gly Asn Ala Tyr Ile Val Arg Glu Val Glu Gly Ser Glu Trp Pro
245 250 255
Glu Glu Leu Ile Val Pro Phe Leu Tyr Asp Val Asn Lys Lys Thr Phe
260 265 270
Glu Ile Tyr Glu Ser Thr Ala Asp Asn Asp Gln Phe Val Ser Ala Val
275 280 285
Ala Asn Asp Gly Leu Met Phe Cys Ala Thr Pro Tyr Gly Asn Pro Val
290 295 300
Arg Thr Met Ser Val Arg Val Asp Gly Lys Tyr Ile Asp Leu Gly Met
305 310 315 320
Ile Leu Lys Gln Arg Tyr Ser Ile Asp Phe Thr Lys Ala Met Gly Val
325 330 335
Glu Tyr Thr Gly Thr Ala Val Ala Val Ser Asp Asp Gly Lys Thr Ile
340 345 350
Val Ala Phe Pro Ser Pro Gln Asp Asp Asn Tyr Ala Leu Thr Leu Pro
355 360 365
Val Ser Phe Ser Glu Ala Ala Gln Gly Val Asn Ile Leu Ala Glu Tyr
370 375 380
Glu Leu Ser Pro Val Ser Gly Ser Ser Ile Ala Lys Ile Arg Gln Val
385 390 395 400
Ala Leu Ser Phe Thr Arg Glu Ala Thr Val Ala Glu Gly Ala Glu Ala
405 410 415
Tyr Ile Tyr Lys Gly Glu Thr Lys Leu Ala Val Thr Asn Ser Ile Ser
420 425 430
Ala Ala Ser Val Lys Asp Asn Arg Ile Phe Ile Leu Asp Phe Pro Glu
435 440 445
Thr Ala Phe Ser Glu Asp Glu Glu Tyr Val Leu Lys Ile Pro Ala Gly
450 455 460
Ile Phe Val Asn Gly Asn Met Arg Asn Asn Glu Ile Thr Ala Val Tyr
465 470 475 480
Lys Gly Arg Ala Asp Arg Pro Val Ala Pro Thr Arg Ile Ala Pro Leu
485 490 495
Glu Gly Ser Val Ile Thr Glu Leu Ser Tyr Asn Asn Pro Val Met Ile
500 505 510
Thr Phe Asp Thr Lys Ile Ala Val Ala Gln Ala Val Thr Gly Ile Leu
515 520 525
Phe Gln Ser Gly Ser Glu Gln Pro Leu Ser Ser Leu Val Leu Val Ala
530 535 540
Asp Gly Asn Lys Leu Tyr Val Tyr Pro Pro Thr Thr Arg Arg Leu Met
545 550 555 560
Lys Gly Ser Asp Tyr Val Val Lys Ile Pro Ala Gly Ala Val Ser Asp
565 570 575
Ile Val Gly Phe Cys Pro Ser Asn Glu Ile Thr Val Asn Tyr His Gly
580 585 590
Ala Phe Val Val Thr Pro Pro Glu Gln Gly Ser Ser Phe Leu Phe Val
595 600 605
Asp Asp Phe Asn Asp Pro Gly Thr Ser Leu Ser Lys Trp Leu Met Tyr
610 615 620
Glu Gly Asp His Asn Thr Pro Thr Ser Glu Met Val Gly Trp Gly Phe
625 630 635 640
Asp Ala Asp Asn Thr Pro Trp Asn Phe Ser Val His Asp Asn Gly Gln
645 650 655
Tyr Asp Tyr Cys Ala Ala Ser Thr Ser Met Tyr Ser Pro Ala Gly Lys
660 665 670
Ala Asn Asp Trp Met Ile Thr Pro Gln Leu Ser Ile Gly Asn Glu Tyr
675 680 685
Tyr Arg Leu Asn Phe Gln Thr Gln Ser Tyr Lys Ala Gly Lys Thr Asp
690 695 700
Lys Leu Lys Val Tyr Val Trp Glu Asn Asn Glu Leu Ile Glu Gly Thr
705 710 715 720
Val Leu Lys Glu His Ile Asp Ala Ile Arg Ala Asn Gly Lys Leu Val
725 730 735
Phe Asp Gly Ile Leu Thr Pro Gly Ala Asn Glu Ala Thr Leu Thr Gly
740 745 750
Glu Trp Glu Ser His Ser Ile Ser Leu Ala Glu Tyr Ala Gly Lys Lys
755 760 765
Val Tyr Ile Ala Phe Leu Asn Glu Asn Glu Asp Gln Ser Val Val Phe
770 775 780
Val Asp Asn Val Asn Val Ile Tyr Lys Gly Asn Phe Val Ala Gly Ser
785 790 795 800
Met Thr Gln Glu Asn Val Val Ser Gln Glu Glu Val Glu Ile Lys Gly
805 810 815
Tyr Val Met Val Thr Asn Asn Ile Ser Tyr Asp Ala Ile Glu Ala Thr
820 825 830
Tyr Lys Thr Val Asp Gly Thr Gln Ile Gly Ile Tyr Lys Ala Glu Gly
835 840 845
Ile Gly Leu Lys Glu Gly Ser Val Tyr Glu Phe Thr Phe Pro Glu Lys
850 855 860
Leu Lys Leu Thr Lys Gly Gln Glu Thr Gln Phe Thr Ile Val Val Lys
865 870 875 880
Met Gly Thr Glu Glu Gln Glu Ile Asn Ala Ser Val Lys Asn Leu Ala
885 890 895
Phe Asn Pro Lys Arg Arg Val Val Leu Glu Glu Gly Thr Gly Ala Trp
900 905 910
Cys Gly Asn Cys Pro Gly Gly Ile Leu Ala Val Glu Tyr Ile Glu Ser
915 920 925
Gln Phe Pro Gly Gln Leu Ile Pro Val Cys Ile His Asn Asp Asp Ala
930 935 940
Tyr Ala Tyr Gly Ala Tyr Glu Asn Phe Leu Gly Phe Arg Ser Phe Pro
945 950 955 960
Thr Gly Arg Val Asn Arg Ile Glu Asn Tyr Leu Ser Tyr Met Asp Ser
965 970 975
Asp Glu Glu Gly Asn Pro Ser Phe Phe Ser Val Asp Gly Asp Val Thr
980 985 990
Phe Ala Asp Tyr Val Lys Ser Glu Ile Asp Lys Leu Thr Asp Val Glu
995 1000 1005
Ile Glu Thr Gly Glu Ala Val Tyr Asn Ser Ser Thr Asp Glu Ile
1010 1015 1020
Thr Val Pro Val Asp Val Arg Phe Ala Leu Asp Lys Asn Ser Val
1025 1030 1035
Asn Tyr Asn Ile Leu Thr Val Val Val Glu Asp Asn Leu Tyr Ala
1040 1045 1050
Asn Gln Arg Asn Tyr Phe Ala Ser Arg Thr Asp Pro Ile Tyr Gly
1055 1060 1065
Asp Trp Gly Leu Gly Gly Lys Tyr Gly Arg Gly Ser Val Trp Tyr
1070 1075 1080
Ala Tyr Lys Asp Val Ala Arg Ala Ile Val Gly Gln Ser Tyr Tyr
1085 1090 1095
Gly Glu Asn Gly Tyr Ile Pro Thr Ser Val Arg Gly Gly Glu Thr
1100 1105 1110
Ile Thr Gly Asn Val Ser Phe Glu Val Pro Gly Lys Ser Ile Thr
1115 1120 1125
Ser Leu Gln Asn Cys Lys Ile Val Cys Met Leu Ile Asp Ala Ala
1130 1135 1140
Asn Gly Tyr Val Leu Asn Ala Ala Arg Cys Asp Ala Ile Ser Asn
1145 1150 1155
Ala Leu Glu Trp Asp Leu Ala Asn Gly Ile Glu Asp Ala Val Val
1160 1165 1170
Asp Thr Asp Thr Ala Gly Val Glu Asn Tyr Asn Leu Ala Gly Gln
1175 1180 1185
Lys Met Asn Ala Gln Arg Lys Gly Leu Asn Ile Val Arg Leu Ser
1190 1195 1200
Asn Gly Arg Thr Val Lys Val Val Lys
1205 1210
<210> 17
<211> 3639
<212> DNA
<213> Prevotella muris
<400> 17
atgaaaaaaa ctattgcatt atttgtatgg agtttcattt tcatattagg gctttccgcc 60
caaagtgaat ctcctgtatt gaaaactctt gaaattccat tcggtactat ttgcgggatg 120
tcagacaacg gccgttgggc tgtttatgac gatggcaatg acggtgttga gcggggctgt 180
tctgtggatg tctatgactt ggctacagga accattgtaa cactctctga aggaacggaa 240
ggggataata ttcatttgaa tgatatttcg gatgacggta agatagtggc aggttcatac 300
aacaatgttc ccgcttactt taaggatgga aagtggacgg tacttccttt accggccggc 360
acacgtggat atggcggaag agtttcgagt atgactccgg atggttcggt aatggtcgga 420
atgatttata ataactcctt taattttacg tcatgctatt ggaaagacgg aaatcttatc 480
actcttgaag gattgcctac aaccgatata tccggtaagg aaaatgcaga attgaatatt 540
gttgctgttt cggctgacgg caatattgtt ctcggaggat tgtcaaccaa tcatccggga 600
tggggatgct gttattttgt atatgatgtc aggacaaaga cttacgagat gcttggtcag 660
aatcttacct cagagtcatt cattgataat gttgtgatga gcaataacgg aagttttgtg 720
ggtggtaatg catatatagt gcgtgaggtt gaaggttcgg aatggccgga ggagcttatt 780
gttcctttcc tttatgatgt gaataaaaaa acgtttgaaa tatatgaaag tacagctgat 840
aatgaccagt ttgtttcagc tgtggcaaat gacggtttga tgttctgtgc cactccgtac 900
ggtaatcctg tgcgtacgat gagtgtgcgt gtggatggaa aatatattga tttgggaatg 960
atactcaaac aacgctattc gatagatttt acaaaggcaa tgggtgttga gtatacggga 1020
actgccgtag cggtgagtga cgatggtaag acaatagtgg ctttcccttc tccacaagat 1080
gataattatg ccttgactct ccctgtttca ttctctgagg ccgcacaagg tgttaatatt 1140
cttgcggaat atgagctttc tcctgtgtca ggaagcagta tcgcaaaaat aaggcaggtc 1200
gctttatctt ttacgcgtga ggccaccgta gctgaaggtg ccgaggcata tatttataaa 1260
ggtgaaacaa agttggcagt cacaaattcc ataagtgcgg cgtctgtaaa agataatcgt 1320
atatttatac ttgattttcc tgagacagcc ttttcggaag atgaagagta tgtgcttaag 1380
atacctgcag gtattttcgt aaatggaaat atgagaaata atgaaatcac ggctgtatat 1440
aaaggacgtg cagacagacc ggtagctcca acacgcatag ctccgttgga aggaagcgta 1500
ataacagagt taagctacaa caatcctgtc atgattactt tcgacacgaa gattgctgtg 1560
gctcaggcag tcacaggcat tcttttccag agtggttcgg aacagccgtt gagcagtctt 1620
gtacttgttg ccgatggaaa caaactatac gtatatcctc ctacaacgcg tcgtttgatg 1680
aaaggttccg actatgtggt gaagatacct gccggtgctg tttccgatat agtagggttc 1740
tgccccagta atgagattac tgtcaattat catggggctt ttgttgtaac acctcccgag 1800
cagggttcaa gcttcctgtt tgttgatgac ttcaatgatc cgggaacgtc actgagcaaa 1860
tggcttatgt acgagggtga ccataacaca ccgacctcag aaatggtagg atggggattt 1920
gatgcagaca atacgccatg gaatttctct gtgcatgaca acggacagta tgattattgt 1980
gccgcatcta cttccatgta ttctccggcc ggcaaggcaa acgactggat gataacccca 2040
cagctttcta taggaaacga atattacaga ctgaatttcc agacgcagtc atataaggcc 2100
ggtaagactg ataaactgaa agtttatgta tgggagaaca atgagcttat agaaggtact 2160
gtgctgaaag agcacattga tgcaatacgt gcaaacggaa agctggtatt tgacggtata 2220
ctgactccgg gagcaaatga agctacgctt accggtgaat gggaaagcca cagcatttca 2280
cttgcagaat atgcaggaaa gaaagtgtat atagcattcc tcaacgaaaa cgaagaccaa 2340
agtgtagtgt ttgttgataa cgtaaatgta atatacaaag gcaactttgt tgccggaagc 2400
atgacacagg agaatgtcgt ctcacaggaa gaggtggaaa tcaagggata tgtaatggtg 2460
accaacaaca tatcatacga tgccattgag gccacatata agacggtaga cggaacccaa 2520
ataggaattt ataaggctga gggtatcggt ttgaaggagg gcagtgttta tgaatttacg 2580
ttccctgaaa aactgaagtt gaccaaaggg caggaaacgc agtttactat agtggtgaag 2640
atgggtacgg aagaacagga gataaatgct tcggttaaga atcttgcgtt caatcccaag 2700
cgtcgtgtcg tacttgaaga aggcaccggc gcatggtgtg gaaattgtcc tggaggtata 2760
cttgctgtgg aatatattga aagtcagttc cccggtcagt tgatacctgt atgcatacat 2820
aatgacgatg cgtatgctta tggtgcgtat gagaatttcc ttggtttccg ctcgttccct 2880
acaggtcgtg taaaccgtat tgagaattat ctttcatata tggactcaga tgaggaaggc 2940
aatccgtctt tcttctctgt tgacggagat gtaacgtttg ctgactatgt gaagagtgag 3000
atagacaaac ttacagatgt tgaaatagag acaggcgagg ctgtctataa ttcgtctacg 3060
gatgagataa ctgtgccggt cgatgtaagg tttgcccttg acaagaattc tgtaaattat 3120
aatattctta ctgttgttgt cgaggataat ctatatgcta atcagcgtaa ttattttgca 3180
agccgtacgg atcctattta tggagactgg gggcttggcg gtaagtatgg ccgtggttcc 3240
gtatggtatg cttataagga tgtcgcacgt gcaatcgtgg gtcagtctta ttatggtgag 3300
aacggatata ttcccacttc tgtcagaggc ggtgagacaa taactggtaa tgtttctttt 3360
gaagttcccg gcaaatcgat tacttcactg cagaattgca agattgtatg tatgctgatt 3420
gatgcggcta acggatacgt tcttaatgca gcacggtgtg atgcaatatc taacgcattg 3480
gaatgggatc ttgcgaatgg aatagaggat gctgttgttg atacagatac agcaggtgtg 3540
gagaattaca atctggcagg ccagaaaatg aatgctcagc gcaaaggtct taatatcgta 3600
aggctttcca acggacgtac ggtgaaagtc gtaaagtga 3639
<210> 18
<211> 1027
<212> PRT
<213> Paraprevotella clara
<400> 18
Met Arg Glu Lys Leu Tyr Thr Lys Trp Lys Gly Gly Leu Asn Arg Leu
1 5 10 15
Cys Phe Leu Leu Met Cys Phe Cys Trp Thr Thr Val Gln Ser Trp Ala
20 25 30
Val Gly Glu Asp Leu His Leu Thr Ile Glu Asn Gly Lys Thr Tyr Glu
35 40 45
Phe Glu Ala Phe Asn Ser Tyr Tyr Leu Thr Tyr Val Ala Thr Ala Asn
50 55 60
Gly Gln Leu Ser Leu Tyr Gln Thr Gly Gly Asp Phe Cys Arg Gln Tyr
65 70 75 80
Thr Asp Asn Thr Phe Glu Thr Glu Leu Pro Ser Thr Pro Gln Tyr Val
85 90 95
Asn Glu Gly Lys Leu Val Glu Val Lys Val Glu Ser Gly Lys Thr Tyr
100 105 110
Tyr Phe Leu Thr Arg Gly Leu Ser Lys Gly Glu Leu Thr Val Thr Phe
115 120 125
Gly Glu Lys Ala Thr Pro Leu Glu Leu Leu Ser Leu Ser Lys Glu Glu
130 135 140
Gly Thr Thr Leu Asn Leu Ser Ile Asp Thr Leu Leu Gly Phe Thr Phe
145 150 155 160
Asn Arg Met Val Lys Val Gly Asn Cys Thr Leu Ser Ser Gly Ser Val
165 170 175
Ile Gly Asn Leu Thr Ala Ser Thr His Asp Tyr Gly Phe Thr Val Ser
180 185 190
Ile Lys Asp Val Leu Tyr Lys Trp Leu Lys Glu Gly Asn Val Lys Ala
195 200 205
Gly Asp Glu Val Val Leu Thr Val Thr Gly Leu Cys Asn Ala Asn Asp
210 215 220
Glu Ser Asp Lys Tyr Asn Gly Asn Gly Val Leu Thr Val Lys Tyr Ile
225 230 235 240
Ala Gly Ala Leu Pro Ala Glu Leu Val Ser Val Ala Asn Ser Pro Glu
245 250 255
Glu Met Asn Phe Leu Ser Tyr Tyr Leu Pro Thr Asp Glu Arg Gly Leu
260 265 270
Val Thr Leu Thr Phe Asn Arg Ser Met Gly Glu Asp Ala Thr Ala Lys
275 280 285
Leu Phe Tyr Gly Asp Ile Glu Gly Ser Asn Val Tyr Thr Glu Ser Leu
290 295 300
Pro Val Lys Val Lys Asp Thr Gln Leu Ile Val Asp Leu Arg Gly Lys
305 310 315 320
Arg Arg Leu Pro Ser Asp Met Leu Pro Asn Ala Ser Ala Asp Val Trp
325 330 335
Glu Gln Tyr Lys Thr Ile Ser Leu Lys Val Ser Asn Val Arg Asp Ala
340 345 350
Glu Gly Asn Tyr Ala Met Ser Thr Ser Gln Gly Thr Thr Gly Ser Tyr
355 360 365
Ala Phe Asn Tyr Thr Ile Glu Ala Val Glu Phe Asp Ile Ala Arg Gln
370 375 380
Phe Thr Pro Asp Glu Gly Thr Ser Leu Asp Asp Tyr Pro Thr Ile Glu
385 390 395 400
Leu Trp Ile Arg Glu Asp Asp Ser Phe Thr Tyr Glu Gly Val Asp Phe
405 410 415
Thr Tyr Ser Val Asn Gly Lys Pro Glu Thr Val Phe Val Pro Ile Ser
420 425 430
Gln Ile Thr Lys Lys Asp Ser Pro Asp Gly Asp Gly Val Leu Leu Thr
435 440 445
Ile Pro Val Pro Asn Lys Ala Ala Asp Glu Asn Ser Asp Val Val Val
450 455 460
Ser Leu Asn Asn Leu Lys Ala Ala Asp Gly Ala Asp Tyr Ser Glu Asp
465 470 475 480
Phe Thr Ile Ile Tyr Lys Thr Ser Gly Lys Ser Met Ala Gly Leu Glu
485 490 495
Ile Glu Ser Val Ser Pro Ala Asp Gly Ala Ala Ile Ala Ala Leu Glu
500 505 510
Ala Gly Ser Phe Leu Glu Leu Ser Thr Asn Met Asn Asp Arg Val Gly
515 520 525
Tyr Val Trp Phe Arg Ile Asp Asp Gln Asn Pro Lys Asp Pro Asp Gln
530 535 540
Ala Cys Val Lys Thr Met Thr Ser Met Lys Lys Glu Thr Val Glu Gly
545 550 555 560
Lys Val Ser Phe Arg Thr Glu Ile Ile Arg Ser Val Thr Phe Phe Glu
565 570 575
Gly His Thr Tyr Lys Val Thr Phe Asn Ala Tyr Ala Ser Glu Ser Asp
580 585 590
Tyr Gln His Gly Ala Asp Ala Leu Gly Val Met Thr Val Thr Tyr Thr
595 600 605
Gly Thr Thr Pro Glu Phe Lys Phe Ser Pro Val Lys Phe Val Gly Ile
610 615 620
Thr Pro Asp Pro Asp Tyr Thr Val Ile Ser Asp Val Ser Gln Asn Glu
625 630 635 640
Phe Thr Leu Thr Phe Asp Gly Pro Val Val Leu Asn His Glu Asn Ala
645 650 655
Phe Val Val Tyr Gly Gln Gly Met Asn Tyr Pro Phe Glu Ser Ile Glu
660 665 670
Ser Asn Glu Asp Gly Thr Glu Trp Thr Val Lys Ala Ser Leu Glu Lys
675 680 685
Leu Met Glu Met Gly Met Met Pro Arg Leu Ser Met Ser Phe Met Pro
690 695 700
Val Asp Lys Asp Gly Met Leu Val Glu Gly Asn Ala Gly Lys Asp Ala
705 710 715 720
Thr Ser Tyr Leu Asn Phe Ala Tyr Asp Cys Thr Ile Gly Ile Pro Asp
725 730 735
Leu Gln Val Ser Pro Glu Ser Gly Ser Lys Val Glu Ser Leu Lys Glu
740 745 750
Ile Ile Val Gly Cys Lys Asp Ile Met Asp Asp Asn Gly Glu Val Ile
755 760 765
Phe Gln Gly Gly Ile Ser Glu Ser Tyr Met Ala Ala Glu Lys Ile Ile
770 775 780
Leu Tyr Lys Asn Gly Arg Glu Pro Val Ala Thr Val Thr Ser Ile Glu
785 790 795 800
Pro Ile Ile Pro Glu Asp Gln Glu Asp Asn Tyr Phe Tyr Val Pro Val
805 810 815
Glu Met Lys Leu Thr Leu Asp Thr Glu Ile Thr Asp Asn Gly Asn Tyr
820 825 830
Arg Leu His Ile Pro Ala Asn Tyr Phe Ile Leu Gly Thr Gly Met Ser
835 840 845
Asn Val Asn Ser Lys Glu Thr Ser Val Leu Tyr Thr Ile Asp Gln Pro
850 855 860
Ile Lys Ile Thr Val Thr Pro Glu Asn Asn Ser Thr Val Glu Ser Leu
865 870 875 880
Lys Glu Ile Thr Ile Glu Cys Glu Ser Gly Ile Asp Val Pro Ser Met
885 890 895
Gly Thr Ile Gln Leu Leu Asn Ala Gln Asn Glu Val Val Ala Ser Ala
900 905 910
Thr Gly Glu Asp Cys Glu Leu Leu Leu Pro Glu Gly Ala Gly Pro Trp
915 920 925
Asp Pro Tyr Thr Gly Val Thr Ile Lys Leu Asp Gln Glu Val Thr Glu
930 935 940
Lys Gly Thr Tyr Lys Leu Val Ile Pro Glu Gly Phe Phe Tyr Leu Gly
945 950 955 960
Glu Asn Tyr Glu Asn Ser Asp Glu Met Thr Phe Thr Tyr Ser Ile Asn
965 970 975
Ala Ser Gly Ile His Ser Ile Gly Thr Glu Ser Lys Gly Val Val Val
980 985 990
Tyr Thr Val Asp Gly Lys Phe Ile Leu Lys Ser Ser Asp Ala Lys Asp
995 1000 1005
Val Lys Asn Leu Lys Lys Gly Leu Tyr Ile Val Asn Gly Lys Lys
1010 1015 1020
Met Met Val Lys
1025
<210> 19
<211> 1027
<212> PRT
<213> Paraprevotella clara
<400> 19
Met Arg Glu Lys Pro Tyr Thr Lys Trp Lys Gly Gly Leu Asn Arg Leu
1 5 10 15
Cys Phe Leu Leu Met Cys Phe Cys Trp Thr Thr Val Gln Ser Trp Ala
20 25 30
Val Gly Glu Asp Val His Leu Thr Ile Glu Asn Gly Lys Thr Tyr Glu
35 40 45
Phe Glu Ala Phe Asn Ser Tyr Tyr Leu Thr Tyr Val Ala Thr Ala Asn
50 55 60
Gly Gln Leu Ser Leu Tyr Gln Thr Gly Gly Asp Phe Cys Arg Gln Tyr
65 70 75 80
Thr Asp Asn Thr Phe Glu Thr Glu Leu Pro Ser Thr Pro Gln Tyr Val
85 90 95
Asn Glu Gly Lys Leu Val Glu Val Lys Val Glu Ser Gly Lys Thr Tyr
100 105 110
Tyr Phe Leu Thr Arg Gly Leu Ser Lys Gly Glu Leu Thr Val Thr Phe
115 120 125
Gly Glu Lys Ala Thr Pro Leu Glu Leu Leu Ser Leu Ser Lys Glu Glu
130 135 140
Gly Thr Thr Leu Asn Leu Ser Ile Asp Thr Leu Leu Gly Phe Thr Phe
145 150 155 160
Asn Arg Met Val Lys Val Gly Asn Cys Thr Leu Ser Ser Gly Ser Val
165 170 175
Ile Gly Asn Leu Thr Ala Ser Thr His Asp Tyr Gly Phe Thr Val Ser
180 185 190
Ile Lys Asp Val Leu Tyr Lys Trp Leu Lys Glu Gly Asn Val Lys Ala
195 200 205
Gly Asp Glu Val Val Leu Thr Val Thr Gly Leu Cys Asn Ala Asn Asp
210 215 220
Glu Asn Asp Lys Tyr Asn Gly Asn Gly Val Leu Thr Val Lys Tyr Ile
225 230 235 240
Ala Gly Ala Leu Pro Ala Glu Leu Val Ser Val Thr Asn Ser Pro Glu
245 250 255
Asp Met Asn Phe Leu Ser Tyr Tyr Leu Pro Thr Asp Glu Arg Gly Leu
260 265 270
Val Thr Leu Thr Phe Asn Arg Ser Met Gly Glu Asp Ala Thr Ala Lys
275 280 285
Leu Phe Tyr Gly Asp Ile Glu Gly Ser Asn Val Tyr Thr Glu Ser Leu
290 295 300
Pro Val Lys Val Lys Asp Thr Gln Leu Ile Val Asp Leu Arg Gly Lys
305 310 315 320
Arg Arg Leu Pro Ser Asp Met Leu Pro Asn Ala Ser Ala Asp Val Trp
325 330 335
Glu Gln Tyr Lys Thr Ile Ser Leu Lys Val Ser Asn Val Lys Asp Val
340 345 350
Glu Gly Asn Tyr Ala Met Ser Thr Ser Gln Gly Thr Thr Gly Ser Tyr
355 360 365
Ala Phe Asn Tyr Thr Ile Glu Ala Val Glu Phe Asp Ile Ala Arg Gln
370 375 380
Phe Thr Pro Asp Glu Gly Thr Ser Leu Asp Asp Tyr Pro Thr Ile Glu
385 390 395 400
Leu Trp Ile Arg Glu Asp Asp Ser Phe Thr Tyr Glu Gly Val Asp Phe
405 410 415
Thr Tyr Ser Val Asn Gly Lys Pro Glu Thr Val Phe Val Pro Ile Ser
420 425 430
Gln Ile Thr Lys Lys Asp Ser Pro Asp Gly Asp Gly Val Leu Leu Thr
435 440 445
Ile Pro Val Pro Asn Lys Ala Ala Asp Glu Asn Ser Asp Val Val Val
450 455 460
Ser Leu Asn Asn Leu Lys Ala Ala Asp Gly Ala Asp Tyr Ser Glu Asp
465 470 475 480
Phe Thr Ile Ile Tyr Lys Thr Ser Gly Lys Ser Met Ala Gly Leu Glu
485 490 495
Ile Glu Ser Val Ser Pro Ala Asp Gly Ala Ala Ile Ala Ala Leu Glu
500 505 510
Ala Gly Ser Phe Leu Glu Leu Ser Thr Asn Met Asn Asp Arg Val Gly
515 520 525
Tyr Val Trp Phe Arg Ile Asp Asp Gln Asn Pro Lys Asp Pro Asp Gln
530 535 540
Ala Cys Val Lys Thr Met Thr Ser Met Lys Lys Glu Thr Val Glu Gly
545 550 555 560
Lys Val Ser Phe Arg Thr Glu Ile Ile Arg Ser Val Thr Phe Phe Glu
565 570 575
Gly His Thr Tyr Lys Val Thr Phe Asn Ala Tyr Ala Ser Glu Ser Asp
580 585 590
Tyr Gln His Gly Ala Asp Ala Leu Gly Val Met Thr Val Thr Tyr Thr
595 600 605
Gly Thr Thr Pro Glu Phe Lys Phe Ser Pro Val Lys Phe Val Gly Ile
610 615 620
Thr Pro Asp Pro Asp Tyr Thr Val Ile Ser Asp Val Ser Gln Asn Glu
625 630 635 640
Phe Thr Leu Thr Phe Asp Gly Pro Val Val Leu Asn His Glu Asn Ala
645 650 655
Phe Val Val Tyr Gly Gln Gly Met Asn Tyr Pro Phe Glu Ser Ile Glu
660 665 670
Ser Asn Glu Asp Gly Thr Glu Trp Thr Val Lys Ala Ser Leu Glu Lys
675 680 685
Leu Met Glu Met Gly Met Met Pro Arg Leu Ser Met Ser Phe Met Pro
690 695 700
Val Asp Lys Asp Gly Met Leu Val Glu Gly Asn Ala Gly Lys Asp Ala
705 710 715 720
Thr Ser Tyr Leu Asn Phe Ala Tyr Asp Cys Thr Ile Gly Ile Pro Asp
725 730 735
Leu Gln Val Ser Pro Glu Ser Gly Ser Lys Val Glu Ser Leu Lys Glu
740 745 750
Ile Ile Val Gly Cys Lys Asp Ile Met Asp Asp Asn Gly Glu Val Ile
755 760 765
Phe Gln Gly Gly Ile Ser Glu Ser Tyr Met Ala Ala Glu Lys Ile Ile
770 775 780
Leu Tyr Lys Asn Gly Arg Glu Pro Val Ala Thr Val Thr Ser Ile Glu
785 790 795 800
Pro Ile Ile Pro Glu Asp Gln Glu Asp Asn Tyr Phe Tyr Val Pro Val
805 810 815
Glu Met Lys Leu Thr Leu Asp Thr Glu Ile Thr Asp Asn Gly Asn Tyr
820 825 830
Arg Leu His Ile Pro Ala Asn Tyr Phe Ile Leu Gly Thr Gly Met Ser
835 840 845
Asn Val Asn Ser Lys Glu Thr Ser Val Leu Tyr Thr Ile Asp Gln Pro
850 855 860
Ile Lys Ile Thr Val Thr Pro Glu Asn Asn Ser Thr Val Glu Ser Leu
865 870 875 880
Lys Glu Ile Thr Ile Glu Cys Glu Ser Gly Ile Asp Val Pro Ser Met
885 890 895
Gly Thr Ile Gln Leu Leu Asn Ala Gln Asn Glu Val Val Ala Ser Ala
900 905 910
Thr Gly Glu Asp Cys Glu Leu Leu Leu Pro Glu Gly Ala Gly Pro Trp
915 920 925
Asp Pro Tyr Thr Gly Val Thr Ile Lys Leu Asp Gln Glu Val Thr Glu
930 935 940
Lys Gly Thr Tyr Lys Leu Val Ile Pro Glu Gly Phe Phe Tyr Leu Gly
945 950 955 960
Glu Asn Tyr Glu Asn Ser Asp Glu Met Thr Phe Thr Tyr Ser Ile Asn
965 970 975
Ala Ser Gly Ile His Ser Ile Gly Thr Glu Ser Lys Gly Val Val Val
980 985 990
Tyr Thr Val Asp Gly Lys Phe Ile Leu Lys Ser Ser Asp Ala Lys Asp
995 1000 1005
Val Lys Asn Leu Lys Lys Gly Leu Tyr Ile Val Asn Gly Lys Lys
1010 1015 1020
Met Met Val Lys
1025
<210> 20
<211> 3084
<212> DNA
<213> Paraprevotella clara
<400> 20
atgagagaga aaccttacac gaaatggaag ggaggcctga ataggctttg ttttcttctt 60
atgtgttttt gttggactac ggttcagtca tgggccgttg gtgaagatgt acatttaacc 120
attgaaaatg gtaagacgta tgaatttgaa gctttcaaca gttattatct tacttatgta 180
gccacggcta acggacaatt atcattgtat caaaccggtg gagacttttg tcgtcagtat 240
acagataaca cttttgaaac ggaattgcct tctacccctc agtatgttaa tgagggaaaa 300
cttgtagaag tgaaggttga gtcaggtaag acttattatt tcttgactcg aggtttaagt 360
aaaggagaac tgaccgttac tttcggagaa aaagcaactc cacttgaatt actttcgctc 420
tccaaggaag agggaacgac attgaatctt tctattgata ccctcttagg cttcactttc 480
aatagaatgg taaaagtagg aaattgcaca ttgtcttccg ggtcggtaat cggaaacttg 540
accgcttcga cgcatgatta tggtttcacc gtttccatta aagatgtgtt gtataaatgg 600
ttgaaggagg gtaatgtaaa agccggagat gaagtcgttt tgactgtaac gggattatgt 660
aatgccaatg atgagaacga caaatacaat ggtaatggcg tgctgacggt gaaatatatt 720
gcaggagcgt tgcctgccga gttggttagc gttacgaact cacccgaaga tatgaacttc 780
ctgtcttatt acctgcctac tgatgaaagg gggttggtta ctttgacttt caaccgttcg 840
atgggcgaag atgctacggc aaaattattc tatggcgata tagaaggttc gaatgtctat 900
acggaaagtc ttccggtgaa ggtgaaagat actcagttga ttgtagacct ccgtggaaaa 960
cgtcgtttgc catctgatat gttgccgaat gccagtgctg atgtttggga acagtataaa 1020
accatttcct tgaaggttag caacgtgaag gatgtggaag gaaattatgc tatgtctacc 1080
tctcagggta ccacaggttc gtatgctttt aattatacaa tagaagctgt tgagtttgat 1140
atcgctcgtc agtttacacc agatgaaggc acatctttgg atgattatcc gacaatagaa 1200
ctttggattc gggaagacga ttcgtttact tatgagggtg tagatttcac ttatagcgtg 1260
aatggaaaac ctgagacggt atttgtaccc atcagccaga ttacgaaaaa agactctcca 1320
gacggagatg gtgtactatt gacgattccg gtgccgaata aagcagcgga tgagaattca 1380
gatgtcgtag tttctttgaa taatttgaaa gcggccgatg gtgcggatta ttcagaagat 1440
tttacgataa tatataaaac gtcaggaaag agtatggcag ggcttgagat agaatctgtg 1500
tctcctgctg atggagcagc tatcgcggct ttagaagctg gttcgtttct ggaattgtct 1560
accaatatga atgatcgtgt aggatatgta tggtttagaa tagatgacca gaacccgaag 1620
gatcctgatc aggcttgtgt taagacgatg acatcgatga agaaagagac ggttgaggga 1680
aaagtttctt tccggacaga aataattcgt agtgtgactt tctttgaagg gcatacttat 1740
aaagttactt tcaatgctta tgcttcggag agcgattatc agcacggtgc agatgctctt 1800
ggtgttatga ctgtgactta cacaggtaca actccggaat ttaagtttag tccggttaag 1860
tttgtaggaa taacgccgga tcctgattat actgtaattt cagacgtaag ccagaatgaa 1920
tttacgttga cttttgacgg tcctgtagta ttgaatcatg aaaatgcttt tgttgtatac 1980
ggtcagggaa tgaattatcc atttgaaagt atcgagtcga atgaagacgg tacggaatgg 2040
acggtaaagg catccttgga aaaattaatg gaaatgggta tgatgcctcg attgtctatg 2100
agctttatgc cggtggataa agacggaatg ttagttgaag gtaatgccgg taaagatgcg 2160
acaagttatt tgaattttgc ctatgattgt acgattggta taccagattt gcaagtgagt 2220
ccggaaagcg gttctaaagt ggaaagtttg aaagagatca tagtaggatg taaagatatc 2280
atggacgaca atggtgaggt gatatttcaa ggtggtatca gcgaatctta tatggcggcc 2340
gaaaagatca ttttgtataa gaacggacgt gagccggttg caacggtaac gagtatcgaa 2400
cctattatac cggaggacca agaggataat tacttctacg ttcctgtgga aatgaaatta 2460
acattggata ctgaaattac agataacggt aattatcgtt tgcacattcc tgccaactac 2520
tttatattgg gaacaggtat gtctaatgta aacagtaaag aaacaagtgt gctgtatact 2580
atcgaccagc cgattaagat aacggttacg ccagagaata actctacggt agaaagcctg 2640
aaagaaatta ctatcgaatg tgaatcaggt atagatgtgc cgagtatggg tacaattcaa 2700
ttattgaatg ctcagaatga ggtggtagct tcggcaacgg gtgaggattg cgaattgttg 2760
ttacctgaag gtgccggtcc ttgggatcct tatactggtg tgacaattaa attggatcaa 2820
gaggtgacag agaaaggtac atataagttg gttattcctg aaggattctt ctatttgggt 2880
gaaaactatg aaaattcgga cgagatgacg tttacttatt cgattaatgc cagtggtatt 2940
cattcgattg gtactgagtc taagggagta gtcgtttata ctgtagacgg taaatttatc 3000
ttaaagtcga gcgatgccaa agacgtgaag aatctgaaga aaggtttgta cattgtgaac 3060
ggtaagaaga tgatggtaaa ataa 3084
<210> 21
<211> 787
<212> PRT
<213> Paraprevotella xylaniphila
<400> 21
Met Arg Lys Met Phe Thr Leu Glu Met Lys Asp Tyr Leu Met Arg Met
1 5 10 15
Val Leu Phe Val Ala Val Cys Met Val Gly Val Gly Gly Leu Lys Ala
20 25 30
Ala Pro Gly Leu Ser Ala Asp Asp Pro Leu Ile Phe Glu Gly Gly Glu
35 40 45
Glu Tyr Asp Val Ser Gly Ser Tyr Phe Lys Asp Leu Tyr Ala Thr Phe
50 55 60
Thr Ala Pro Ser Asp Gly Val Leu Thr Leu Thr Phe Asp Ser Thr Asp
65 70 75 80
Pro Leu Asp Leu Tyr Thr Asp Asn Thr Tyr Gln Thr Met Val Glu Pro
85 90 95
Arg Leu Val Tyr Asn Asn Gly Val Cys Glu Leu Glu Val Lys Ala Gly
100 105 110
Ile Thr Tyr Tyr Phe His Arg Ser Phe Met Leu Ser Ala Arg Thr Ile
115 120 125
Ser Val Glu Phe Gly Lys Glu Ala Val Ala Leu Lys Leu Gln Arg Val
130 135 140
Ser Pro Thr Glu Gly Glu Thr Leu Phe Val Ser Asn Ala Thr Val Ser
145 150 155 160
Phe Thr Phe Asn Arg Asp Val Lys Met Asp Gly Ala Thr Leu Thr Ile
165 170 175
Gly Gly Lys Glu Lys Ser Ile Asp Ala Val Ser Thr Ser Ser Ser Val
180 185 190
Gln Phe Asp Leu Ser Ala Ile Met Met Glu Ala Tyr Asn Asp Gly Ile
195 200 205
Leu Lys Glu Gly Asp Asp Met Leu Leu Thr Leu Lys Gly Val Arg Asn
210 215 220
Ala Asn Lys Glu Ser Asp Ile Tyr Gly Glu Asp Gly Thr Cys Ser Leu
225 230 235 240
Ala Leu Lys Ala Ala Ala Lys Pro Val Val Leu Asn Ser Ser Val Asn
245 250 255
Thr Pro Gly Asn Gly Met Asp Ile Phe His Ser Tyr Ile Met Ser Gly
260 265 270
Asp Glu Asn Val Val Gln Leu Asn Phe Asp Gly Ala Leu Asp Thr Glu
275 280 285
Lys Lys Pro Val Ala Thr Leu Ile Tyr Gly Asp Ile Glu Ala Glu Asn
290 295 300
Gly Phe Tyr Gln Glu Thr Leu Asp Ala Gln Phe Leu Gly Glu Asn Thr
305 310 315 320
Val Thr Val Asp Leu Ser Gly Lys Ser Arg Arg His Lys Asp Met Leu
325 330 335
Pro Asn Tyr Ser Gly Ala Ala Phe Gly Thr Ile Ser Leu Lys Ile Ala
340 345 350
Gly Val Tyr Ala Ala Asp Gly Gln Pro Val Tyr Ser Gly Ala Gln Ser
355 360 365
Ser Val Gly Ser Cys Thr Phe Gln Tyr Ala Tyr Glu Glu Val Lys Val
370 375 380
Asp Ile Ala Val Ala Phe Asp Asp Leu Ala Glu Ser Asn Ser Ile Asp
385 390 395 400
Gly Leu Asp Ser Leu Thr Ile Trp Val Arg Gly Asp Glu Tyr Leu His
405 410 415
Tyr Thr Gly Val Gln Phe Asp Phe Met Val Asn Gly Gln Val Asn Ser
420 425 430
Ile Val Glu Lys Asp Ile Lys Lys Lys Ala Asp Ser Glu Glu Glu Gly
435 440 445
Ala Cys Ile Leu Arg Val Cys Val Pro Lys Ser Asp Ala Asp Ala Asn
450 455 460
Ser Glu Val Lys Val Ser Phe Val Gly Leu Glu Ala Ala Asp Gly Gly
465 470 475 480
Asp Tyr Ser Ala Asp Phe Ser Ala Val Phe Thr Thr Val Gly Asn Leu
485 490 495
Val Asn Val Pro Asp Leu Ile Leu Thr Pro Val Ser Gly Ser Thr Val
500 505 510
Glu Asn Ile Lys Glu Val Leu Val Gly Cys Ser Glu Gly Ile Arg Val
515 520 525
Asn Ala Asp Asn Lys Glu Lys Ile Gln Ile Trp Asp Arg Met Arg Asn
530 535 540
Val Val Ala Thr Ala Val Ser Met Glu Pro Val Ile Pro Glu Asp Glu
545 550 555 560
Lys Glu Asn Pro Asp Tyr Ile Pro Thr Glu Ile Lys Val Val Phe Asp
565 570 575
Asn Glu Val Lys Leu Pro Gly Ala Tyr Met Val Asn Leu Pro Ala Glu
580 585 590
Ala Phe Val Leu Gly Asn Gln Thr Ser Ile Leu Ser Lys Ala Thr Leu
595 600 605
Val Asp Tyr Met Ile Ile Gly Glu Val Ala Lys Asp Tyr Val Pro Ser
610 615 620
Glu Val Val Ala Glu Thr Glu Gly Ser Thr Val Ile Gly Phe Thr Ile
625 630 635 640
Lys Thr Pro Glu Trp Val Thr Leu Asp Glu Asn Phe Asp Thr Glu Lys
645 650 655
Ile Lys Ile Tyr Asn Ser Asn Thr Gln Glu Glu Val Val Gly Gly Lys
660 665 670
Ser Val Ser Tyr Ala Ala Ala Phe Glu Asp Phe Ser Ile Ala Leu Glu
675 680 685
Asn Pro Leu Thr Glu Lys Gly Thr Tyr Thr Leu Phe Ile Pro Ala Glu
690 695 700
Leu Phe Gly Asn Gly Ser Trp Ile Pro Gly Ser Thr Glu Gly Ser Cys
705 710 715 720
Asn Pro Glu Leu Thr Tyr Thr Ile Val Ile Asp Glu Asn Gly Val Thr
725 730 735
Val Gly Val Asn Ser Val Leu Ala Glu Ile Gly Thr Lys Val Thr Val
740 745 750
Tyr Asn Leu Asn Gly Val Arg Val Leu Tyr Asn Ala Asp Arg Asp Ala
755 760 765
Leu Lys Ala Leu Arg Lys Gly Ile Tyr Val Val Asn Gly Lys Lys Val
770 775 780
Val Ile Lys
785
<210> 22
<211> 2364
<212> DNA
<213> Paraprevotella xylaniphila
<400> 22
atgaggaaga tgtttacttt ggaaatgaaa gactatctga tgaggatggt tctttttgtg 60
gcagtgtgta tggtgggtgt tggaggattg aaagctgccc ccggattgtc agctgatgat 120
cctttgatct ttgaaggtgg agaagaatat gatgtttcgg gaagttattt taaagatttg 180
tatgctacgt ttacggctcc ttcggacgga gtattgactt tgacatttga tagtaccgat 240
cctttggatt tatatacgga taatacatat caaacgatgg tggagccacg tcttgtttac 300
aataatgggg tttgcgaatt ggaagtaaaa gccgggatta cttattattt tcaccgttct 360
tttatgttga gcgcacgaac catttctgtg gaatttggaa aggaagctgt tgctttgaag 420
ttacaacgag tttctccaac agagggtgag acattgtttg tttccaatgc tacagtaagt 480
ttcactttca atagggatgt taaaatggat ggggctactc ttacaattgg aggaaaagaa 540
aagtctattg atgcggtttc gacatcgtct tctgtacaat ttgacctatc tgccattatg 600
atggaggcat acaatgatgg aatcctaaag gaaggagatg atatgctctt aactttgaag 660
ggagtacgga atgcgaataa agaaagtgat atttatggag aagacgggac ttgctccttg 720
gctttaaaag cagcagcaaa acctgtggtt ctaaacagtt ctgtaaatac gcctggaaat 780
ggaatggata tatttcattc ttatattatg tctggggacg agaatgttgt acagttgaat 840
tttgatggtg ctttggatac agagaagaaa ccggttgcaa cattaatata tggggatata 900
gaagcggaga atggttttta tcaagaaacc ttagatgcgc aattcttagg tgaaaatact 960
gtaactgtag atttgagcgg aaaatcgcgt cggcataaag atatgctgcc gaactattcc 1020
ggcgctgcct ttggaactat ttctttgaaa attgcagggg tgtatgcagc tgacggacaa 1080
cctgtttatt ctggtgctca atcctcagta ggttcttgta cgtttcaata tgcctatgag 1140
gaagtaaagg tggatatagc tgtggctttt gatgacttgg cagaaagtaa ttctattgac 1200
ggattagact cattaacaat ttgggttcgt ggtgatgaat acttgcatta cactggagtg 1260
cagtttgatt ttatggtaaa tggtcaagtt aattctattg tagagaaaga tattaagaaa 1320
aaagcggatt ctgaagaaga aggtgcctgt atattaagag tatgtgtacc gaaaagtgat 1380
gccgatgcca attcggaggt aaaggtcagc tttgttggcc ttgaagcagc tgacggcggg 1440
gattactccg cagatttttc tgctgttttc acaacagtag gaaacttggt aaatgtaccg 1500
gatttgatac tcactccagt gagcggttct acagtggaaa atattaaaga ggtattagtt 1560
ggttgttctg agggtattcg cgtgaatgca gacaacaagg agaaaataca gatatgggat 1620
agaatgcgga acgtagtagc tacagcggta agtatggagc ctgttattcc tgaagatgag 1680
aaggaaaatc cggattacat accgacagaa ataaaggtgg tttttgataa tgaagtgaag 1740
ctccccggag cttatatggt gaatcttccg gcagaggcat tcgttttagg aaatcagaca 1800
tctattttaa gtaaggcaac ccttgtggat tatatgatta taggggaagt tgcgaaagat 1860
tatgttccga gcgaagttgt agccgagaca gaaggaagta cagtgatagg ttttacgata 1920
aagacaccag aatgggtgac tttggatgag aattttgaca cagagaaaat caagatttat 1980
aattcgaata ctcaggaaga agttgtgggt ggaaaatcgg taagttatgc agcagctttt 2040
gaagatttta gtatcgcatt ggagaatcct ctcacagaga aaggtactta tacgttgttt 2100
attccggcag aattgtttgg aaatggttct tggattcccg gttctacaga agggtcttgt 2160
aatccggaat taacttatac gatagttata gacgagaatg gggtgacagt cggagtaaat 2220
tcagtattgg ctgaaattgg gacaaaagta acggtgtaca atcttaatgg tgtacgtgtg 2280
ctttataatg cagaccgtga tgcattgaag gctttaagaa aaggaattta tgtcgtgaat 2340
ggaaagaagg tggtgattaa gtaa 2364
<210> 23
<211> 787
<212> PRT
<213> Paraprevotella xylaniphila
<400> 23
Met Arg Lys Met Phe Thr Leu Glu Met Lys Asp Tyr Leu Met Arg Met
1 5 10 15
Val Leu Phe Val Ala Val Cys Met Val Gly Val Gly Gly Leu Lys Ala
20 25 30
Ala Pro Gly Leu Ser Ala Asp Asp Pro Leu Ile Phe Glu Gly Gly Glu
35 40 45
Glu Tyr Asp Val Ser Gly Ser Tyr Phe Lys Asp Leu Tyr Ala Thr Phe
50 55 60
Thr Ala Pro Ser Asp Gly Val Leu Thr Leu Thr Phe Asp Ser Thr Asp
65 70 75 80
Pro Leu Asp Leu Tyr Thr Asp Asn Thr Tyr Gln Thr Met Val Glu Pro
85 90 95
Arg Leu Val Tyr Asn Asn Gly Val Cys Glu Leu Glu Val Lys Ala Gly
100 105 110
Ile Thr Tyr Tyr Phe His Arg Ser Phe Met Leu Ser Ala Arg Thr Ile
115 120 125
Ser Val Glu Phe Gly Lys Glu Ala Val Ala Leu Lys Leu Gln Arg Val
130 135 140
Ser Pro Thr Glu Gly Glu Thr Leu Phe Val Ser Asn Ala Thr Val Ser
145 150 155 160
Phe Thr Phe Asn Arg Asp Val Lys Met Asp Gly Ala Thr Leu Thr Ile
165 170 175
Gly Gly Lys Glu Lys Ser Ile Asp Ala Val Ser Thr Ser Ser Ser Val
180 185 190
Gln Phe Asp Leu Ser Ala Ile Met Met Glu Ala Tyr Asn Asp Gly Ile
195 200 205
Leu Lys Glu Gly Asp Asp Met Leu Leu Thr Leu Lys Gly Val Arg Asn
210 215 220
Ala Asn Lys Glu Ser Asp Ile Tyr Gly Glu Asp Gly Thr Cys Ser Leu
225 230 235 240
Thr Leu Lys Ala Ala Ala Lys Pro Val Val Leu Asn Ser Ser Val Asn
245 250 255
Thr Pro Gly Asn Gly Met Asp Ile Phe His Ser Tyr Val Met Ser Gly
260 265 270
Asp Glu Asn Val Val Gln Leu Asn Phe Asp Gly Ala Leu Asp Thr Glu
275 280 285
Lys Lys Pro Val Ala Thr Leu Ile Tyr Gly Asp Ile Glu Ala Glu Asn
290 295 300
Gly Phe Tyr Gln Glu Thr Leu Asp Ala Gln Phe Leu Gly Glu Asn Thr
305 310 315 320
Val Thr Val Asp Leu Ser Gly Lys Ser Arg Arg His Lys Asp Met Leu
325 330 335
Pro Asn Tyr Ser Gly Ala Ala Phe Gly Thr Ile Ser Leu Lys Ile Ala
340 345 350
Gly Val Tyr Ala Ala Asp Gly Gln Pro Val Tyr Ser Gly Ala Gln Ser
355 360 365
Ser Val Gly Ser Cys Thr Phe Gln Tyr Ala Tyr Glu Glu Val Lys Val
370 375 380
Asp Ile Ala Val Ala Phe Asp Asp Leu Ala Gly Ser Asn Ser Ile Asp
385 390 395 400
Gly Leu Asp Ser Leu Thr Ile Trp Val Arg Gly Asp Glu Tyr Leu His
405 410 415
Tyr Thr Gly Val Gln Phe Asp Phe Met Val Asn Gly Gln Val Asn Ser
420 425 430
Ile Val Glu Lys Asp Ile Lys Lys Lys Ala Asp Ser Glu Glu Glu Gly
435 440 445
Ala Cys Ile Leu Arg Val Cys Val Pro Lys Ser Asp Ala Asp Ala Asn
450 455 460
Ser Glu Val Lys Val Ser Phe Val Gly Leu Glu Ala Ala Asp Gly Gly
465 470 475 480
Asp Tyr Ser Ala Asp Phe Ser Ala Val Phe Thr Thr Val Gly Asn Leu
485 490 495
Val Asn Val Pro Asp Leu Ile Leu Thr Pro Val Ser Gly Ser Thr Val
500 505 510
Glu Asn Ile Lys Glu Val Leu Val Gly Cys Ser Glu Gly Ile Arg Val
515 520 525
Asn Ala Asp Asn Lys Glu Lys Ile Gln Ile Trp Asp Arg Met Arg Asn
530 535 540
Val Val Ala Thr Ala Val Ser Met Glu Pro Val Ile Pro Glu Asp Glu
545 550 555 560
Lys Glu Asn Pro Asp Tyr Ile Pro Thr Glu Ile Lys Val Val Phe Asp
565 570 575
Asn Glu Val Lys Leu Pro Gly Ala Tyr Met Val Asn Leu Pro Ala Glu
580 585 590
Ala Phe Val Leu Gly Asn Gln Thr Ser Ile Leu Ser Lys Ala Thr Leu
595 600 605
Val Asp Tyr Met Ile Ile Gly Glu Val Ala Lys Asp Tyr Val Pro Ser
610 615 620
Glu Val Val Ala Glu Thr Glu Gly Ser Thr Val Ile Gly Phe Thr Ile
625 630 635 640
Lys Thr Pro Glu Trp Val Thr Leu Asp Glu Asn Phe Asp Thr Glu Lys
645 650 655
Ile Lys Ile Tyr Asn Ser Asn Thr Gln Glu Glu Val Val Gly Gly Lys
660 665 670
Ser Val Ser Tyr Ala Ala Ala Phe Glu Asp Phe Ser Ile Ala Leu Glu
675 680 685
Asn Pro Leu Thr Glu Lys Gly Thr Tyr Thr Leu Phe Ile Pro Ala Glu
690 695 700
Leu Phe Gly Asn Gly Ser Trp Ile Pro Gly Ser Thr Glu Gly Ala Cys
705 710 715 720
Asn Pro Glu Leu Thr Tyr Thr Ile Val Ile Asp Glu Asn Gly Val Thr
725 730 735
Val Gly Val Asn Ser Val Leu Ala Glu Ile Gly Thr Lys Val Thr Val
740 745 750
Tyr Asn Leu Asn Gly Val Arg Val Leu Tyr Asn Ala Asp Arg Asp Ala
755 760 765
Leu Lys Ala Leu Arg Lys Gly Ile Tyr Val Val Asn Gly Lys Lys Val
770 775 780
Val Ile Lys
785
<210> 24
<211> 2364
<212> DNA
<213> Paraprevotella xylaniphila
<400> 24
atgaggaaga tgtttacttt ggaaatgaaa gactatctga tgaggatggt tctttttgtg 60
gcagtgtgta tggtgggtgt tggaggattg aaagctgccc ccggattgtc agctgatgat 120
cctttgatct ttgaaggtgg agaagaatat gatgtttcgg gaagttattt taaagatttg 180
tatgctacgt ttacggctcc ttcggacgga gtattgactt tgacatttga tagtaccgat 240
cctttggatt tatatacgga taatacatat caaacgatgg tggagccacg tcttgtttac 300
aataatgggg tttgcgaatt ggaagtaaaa gccgggatta cttattattt tcaccgttct 360
tttatgttga gcgcacgaac catttctgtg gaatttggaa aggaagctgt tgctttgaag 420
ttacaacgag tttctccaac agagggtgag acattgtttg tttccaatgc tacagtaagt 480
ttcactttca atagggatgt taaaatggat ggggctactc ttacaattgg aggaaaagaa 540
aagtctattg atgcggtttc gacatcgtct tctgtacaat ttgacctatc tgccattatg 600
atggaggcat acaatgatgg aatcctaaag gaaggagatg atatgctctt aactttgaag 660
ggagtacgga atgcgaataa agaaagtgat atttatggag aagacgggac ttgctccttg 720
actttaaaag cagcagcaaa acctgtggtt ctaaacagtt ctgtaaatac gcctggaaat 780
ggaatggata tatttcattc ttatgttatg tctggggacg agaatgttgt acagttgaat 840
tttgatggtg ctttggatac agagaagaaa ccggttgcaa cattaatata tggggatata 900
gaagcggaga atggttttta tcaagaaacc ttagatgcgc aattcttagg tgaaaatact 960
gtaactgtag atttgagcgg aaaatcgcgt cggcataaag atatgctgcc gaactattcc 1020
ggcgctgcct ttggaactat ttctttgaaa attgcagggg tgtatgcagc tgacggacaa 1080
cctgtttatt ctggtgctca atcctcagta ggttcttgta cgtttcaata tgcctatgag 1140
gaagtaaagg tggatatagc tgtggctttt gatgacttgg caggaagtaa ttctattgac 1200
ggattagact cattaacaat ttgggttcgt ggtgatgaat acttgcatta cactggagtg 1260
cagtttgatt ttatggtaaa tggtcaagtt aattctattg tagagaaaga tattaagaaa 1320
aaagcggatt ctgaagaaga aggtgcctgt atattaagag tatgtgtacc gaaaagtgat 1380
gccgatgcca attcggaggt aaaggtcagc tttgttggcc ttgaagcagc tgacggcggg 1440
gattactccg cagatttttc tgctgttttc acaacagtag gaaacttggt aaatgtaccg 1500
gatttgatac tcactccagt gagcggttct acagtggaaa atattaaaga ggtattagtt 1560
ggttgttctg agggtattcg cgtgaatgca gacaacaagg agaaaataca gatatgggat 1620
agaatgcgga acgtagtagc tacagcggta agtatggagc ctgttattcc ggaagatgag 1680
aaggaaaatc cggattacat accgacagaa ataaaggtgg tttttgataa tgaagtgaag 1740
ctccccggag cttatatggt gaatcttccg gcagaggcat tcgttttagg aaatcagaca 1800
tctattttaa gtaaggcaac ccttgtggat tatatgatta taggggaagt tgcgaaagat 1860
tatgttccga gcgaagttgt agccgagaca gaaggaagta cagtgatagg ttttacgata 1920
aagacaccag aatgggtgac tttggatgag aattttgaca cagagaaaat caagatttat 1980
aattcgaata ctcaggaaga agttgtgggt ggaaaatcgg taagttatgc agcagctttt 2040
gaagatttta gtatcgcatt ggagaatcct ctcacagaga aaggtactta tacgttgttt 2100
attccggcag aattgtttgg aaatggttct tggattcccg gttctacaga aggggcttgt 2160
aatccggaat taacttatac gatagttata gacgagaatg gggtgacagt cggagtaaat 2220
tcagtattgg ctgaaattgg gacaaaagta acggtgtaca atcttaatgg tgtacgtgtg 2280
ctttataatg cagaccgtga tgcattgaag gctttaagaa aaggaattta tgtcgtgaat 2340
ggaaagaagg tggtgattaa gtaa 2364
<210> 25
<211> 542
<212> PRT
<213> Prevotella rara
<400> 25
Met Lys Arg Leu Ile Phe Thr Leu Ile Gly Ile Val Leu Gly Leu Thr
1 5 10 15
Gly Leu Lys Ala Val Thr Phe Thr Pro Ile Glu Ile Asn Asp Pro Ser
20 25 30
Glu Ser Val Thr Ile Lys Leu Gln Pro Gly Ala Asn Tyr Leu Gln Leu
35 40 45
Lys Ala Ile Glu Ser Asn Thr Ile His Phe Asn Val Gly Tyr Phe Gly
50 55 60
Ile Met Met Phe Glu Cys Asn Ala Met Gly Glu Glu Gly Asn Asn Leu
65 70 75 80
Ser Ile Ser Tyr Asp Ala Glu Gly His Lys Ile Phe Ser Ser Glu Val
85 90 95
Glu Glu Gly Lys Thr Tyr Tyr Phe Ser Thr Ser Met Ile Thr Glu Pro
100 105 110
Ser Ile Asp Val Thr Ile Tyr Tyr Gly Asn Gly Glu Gly Met Pro Ile
115 120 125
Thr Leu Thr Ser Asn Phe Ser Asp Gly Asp Thr Tyr Thr Val Ser Gly
130 135 140
Ser Asn Leu Glu Leu Ala Phe Asp Arg Thr Val Asp Ile Ala His Asn
145 150 155 160
Trp Ile Glu Tyr Gly Glu Glu Ala Asp Gly Val Phe Lys Thr Lys Glu
165 170 175
Glu Ile Pro Ala Ala Tyr Ile Asn Gly Thr Tyr Thr Thr Gln Tyr Phe
180 185 190
Tyr Ser Ile Glu Leu Ser Lys Phe Ile Arg Glu Met Ala Glu Asp Gly
195 200 205
Lys Leu Glu Val Gly Asp Lys Phe Lys Ile Thr Leu Glu Gly Ile Cys
210 215 220
Asp Ala Asn Asp Glu Ser Val Ile Tyr Gly Glu Asp Gly Asn Tyr Ser
225 230 235 240
Val Thr Leu Ile Met Gly Glu Met Pro Gly Glu Leu Val Ser Ile Asp
245 250 255
Ser Ala Gln Gly Thr Thr Leu Tyr Thr Tyr Tyr Pro Glu Thr Gly Glu
260 265 270
Glu Gly Leu Leu Thr Phe Thr Phe Thr Asp Glu Leu Asn Thr Asp Lys
275 280 285
Ser Lys Val Thr Val Asp Leu Ser Tyr Gly Asp Ser Glu Ala Gly Ser
290 295 300
Met Gly Ser Phe Asn Pro Asp Phe Thr Ile Glu Gly Lys Thr Val Val
305 310 315 320
Val Asp Ile Arg Gly Tyr Arg Phe Pro Glu Thr Val Glu Thr Ser Arg
325 330 335
Asn Gly Glu Val Gln Thr Val Ile Thr Met Asn Ile Lys Gly Leu Thr
340 345 350
Thr Ala Asp Gly Arg Ala Ile Leu Thr Asn Asn Ala Ser Ala Gly Thr
355 360 365
Ser Gly Ile Val Val Thr Tyr Pro Leu Lys Lys Gln Glu Ile Ser Phe
370 375 380
Tyr Tyr Asp Phe Leu Pro Thr Glu Gly Ser Ser Leu Ala Asp Cys Ser
385 390 395 400
Glu Ile Ile Ile Trp Leu Pro Glu Glu Val Gly Val Leu Thr Phe Asp
405 410 415
Lys Val Leu Leu Gln Trp Leu Asn Asn Arg Gly Thr Leu Gln Thr Lys
420 425 430
Glu Phe Lys Ala Glu Asp Val Pro Phe Ala Tyr Asp Lys Ser Tyr Ala
435 440 445
Gly Tyr Val Ser His Ile Pro Leu Ala Gly Val Ser Lys Asp Arg Glu
450 455 460
Val Thr Leu Thr Val Glu Gly Gly Met Leu Ser Asn Gly Asp Ser Val
465 470 475 480
Glu Ile Thr Gly Lys Phe Asn Thr Asp Pro Thr Gly Ile Asp Ser Val
485 490 495
Leu Gly Asp Asp Ala Asn Ala Val Val Lys Leu Tyr Thr Ile Asp Gly
500 505 510
Ile Leu Val Lys Glu Ala Pro Ala Ala Thr Val Leu Thr Gly Val Lys
515 520 525
Lys Gly Val Tyr Ile Met Asn Gly Lys Lys Val Val Val Lys
530 535 540
<210> 26
<211> 1629
<212> DNA
<213> Prevotella rara
<400> 26
atgaaacgac ttatttttac tttaatcggt attgtattgg ggttgacagg tttaaaggct 60
gttactttta ctccgattga gattaatgac ccgtctgagt ctgtaactat taaattgcag 120
ccaggtgcaa actatttaca gctgaaggct attgagagca acaccatcca ctttaatgtc 180
ggatattttg gcattatgat gtttgaatgt aatgccatgg gtgaagaggg caacaacctt 240
tctataagct atgacgctga aggtcacaaa atcttcagta gtgaggttga ggaaggaaaa 300
acttattatt tctcaaccag tatgattact gagccttcaa tagatgttac aatttattat 360
ggtaacggag aaggcatgcc tatcacattg acatctaact tttcggatgg tgatacctat 420
acagtttctg gctcaaacct tgaattggct tttgaccgta cggttgatat agcccacaac 480
tggattgagt atggtgagga ggctgacggc gtatttaaaa caaaggaaga gattcccgca 540
gcatatatca atggtaccta tactactcag tatttctata gcatcgagct gagcaagttt 600
atccgtgaga tggctgaaga cggtaaactt gaggttggtg ataagtttaa aatcactctt 660
gaaggtattt gcgatgctaa cgatgagagc gtaatctatg gtgaggacgg aaactattct 720
gttacactga ttatgggcga gatgcccggt gagctggttt ctatagattc ggcacaaggt 780
acaactcttt atacatatta tcctgaaacc ggtgaagaag gtttgctgac atttactttc 840
acagatgaac ttaatactga taagagtaag gtaactgtag acctgtctta tggcgattca 900
gaggccggtt ctatgggttc atttaatcct gattttacta tcgaaggcaa gactgttgtt 960
gttgatatcc gcggttaccg cttcccagag acagtagaga cttcaagaaa cggtgaagtg 1020
cagactgtga taacaatgaa catcaagggc cttacaactg cagacggacg tgctattctc 1080
acaaacaacg cttctgccgg aacaagtggc atcgttgtta cttatcctct gaagaagcag 1140
gaaatatctt tttattatga cttcttgcct actgaaggca gctctctggc tgattgcagc 1200
gagatcataa tctggttgcc ggaggaagtt ggtgttctta ctttcgacaa ggttctgttg 1260
cagtggctca acaaccgcgg cacactccag acaaaggaat tcaaggcaga ggatgttccg 1320
tttgcttatg ataaatctta tgcgggttat gtaagccata ttccacttgc cggcgtaagc 1380
aaggaccgtg aggttacatt gactgtagaa ggtggaatgc tcagcaatgg cgacagcgtt 1440
gagattacag gtaaattcaa cactgatcct acaggtatag attctgttct tggtgacgat 1500
gccaatgccg ttgttaagct ttacacaatc gacggtatcc tcgttaagga ggctcctgct 1560
gcaactgttc ttacaggtgt taagaagggc gtttatatca tgaacggcaa gaaggttgtt 1620
gtaaaataa 1629
<210> 27
<211> 1027
<212> PRT
<213> Prevotella rodentium
<400> 27
Met Phe Ser Phe Met Gly Ile Gly Gln Ala Ile Ala Asp Val Val Gly
1 5 10 15
Ser Gly Ser Lys Glu Asp Pro Tyr Val Leu Glu Asn Gly Thr Thr Leu
20 25 30
Thr Leu Lys Ala Tyr Gln Ser Phe Tyr Ala Lys Phe Thr Ala Pro Ala
35 40 45
Asp Gly Ile Phe Ser Leu Tyr Thr Lys Asp Ser Tyr Ala Leu Tyr Thr
50 55 60
Asp Glu Ser Phe Ser Gln Ile Asp Glu Ser Ala Asn Ile Thr Phe Asn
65 70 75 80
Gly Ser Tyr Ser Glu Thr Ala Tyr Tyr Phe Asn Cys Lys Ala Gly Val
85 90 95
Thr Tyr Tyr Ile Gly Asn Ser Phe Val Met Val Gly Gly Thr Val Thr
100 105 110
Ala Lys Phe Asn Thr Glu Ala Glu Pro Leu Val Leu Arg Glu Ile Ser
115 120 125
Pro Glu Ala Gly Ser Val Phe Asn Ala Gly Lys Gly Ser Val Glu Leu
130 135 140
Thr Phe Asn Gln Lys Val Gln Ile Ala Ser Val Thr Val Asn Ala Gly
145 150 155 160
Thr Ser Ser Lys Thr Met Ser Ala Asn Val Asn Gly Val Tyr Val Ser
165 170 175
Val Asp Val Lys Asp Trp Leu Asn Gln Met Tyr Asp Glu Gly Lys Val
180 185 190
Lys Glu Gly Asn Glu Ile Ser Phe Lys Phe Lys Gly Val Ala Pro Thr
195 200 205
Ile Ser Pro Ser Lys Leu Tyr Asn Gly Asn Gly Glu Leu Asp Ile Thr
210 215 220
Tyr Thr Ala Gly Lys Lys Pro Leu Gln Leu Val Ser Ser Thr Asn Thr
225 230 235 240
Pro Met Ser Thr Pro Ala Val Thr Thr Phe Lys Ser Phe Tyr Thr Ser
245 250 255
Asp Asp Asn Gln Gly Ile Val Thr Leu Glu Phe Ser Asp Glu Val Asn
260 265 270
Phe Ser Glu Gly Asn Lys Pro Thr Ala Thr Leu Thr Tyr Gly Asn Met
275 280 285
Asp Gln Asp Ala Asp Gly Glu Tyr Tyr Thr Glu Ser Leu Pro Ile Leu
290 295 300
Pro Leu Gly Ser Asn Ile Leu Met Val Asn Leu Lys Asp Lys Leu Arg
305 310 315 320
Arg Ala Gln Asp Met Val Thr Ser Gly Thr Val Tyr Asp Gly Ile Thr
325 330 335
Leu Ser Ile Ser His Val Lys Asp Met Asp Gly Asn Tyr Ala Tyr Gly
340 345 350
Ser Gly Ser Gly Val Leu Gly Ser Phe Ala Phe Asn Tyr Lys Leu Ser
355 360 365
Glu Ile Asn Tyr Thr Val Asp Thr Asp Trp Ala Leu Leu Asp Ala Thr
370 375 380
Gly Asn Pro Thr Asn Lys Asp Val Ile Asp Ser Asp Thr Lys Ser Ile
385 390 395 400
Glu Leu Trp Met Ser Glu Asn Asp Gly Gln Ile Thr Phe Asn Gly Val
405 410 415
Glu Phe Lys Tyr Thr Glu Ser Gly Thr Glu Lys Val Lys Thr Met Thr
420 425 430
Leu Ser Glu Ile Lys Val Asp Asn Asn Gly Asn Glu Thr Thr Ile Thr
435 440 445
Ile Pro Val Pro Asn Ile Ala Ala Asp Ala Gly Ser Asp Ile Ala Ile
450 455 460
Ser Leu Lys Asp Val Glu Arg Pro Asp Gly Ile Thr Thr Ser Ile Asp
465 470 475 480
Pro Ala Ala Leu Ser Tyr Phe Thr Lys Thr Phe Thr Thr Thr Gly Ile
485 490 495
Thr Glu Ser Lys Phe Asp Ile Thr Ser Ala Ile Trp Met Tyr Glu Glu
500 505 510
Asn Glu Gly Asn Ile Val Glu Val Asn Met Ile Asn Gly Asn Ile Gly
515 520 525
Val Leu Thr Arg Gly Ser Lys Ser Val Ile Lys Thr Asn Lys Asp Asn
530 535 540
Glu Ile Gly Tyr Val Glu Trp Glu Ile Arg Gly Val Asp Asn Pro Asp
545 550 555 560
Met Glu Tyr Ile Arg Ala Gly Tyr Val Asp Thr Thr Met Thr Gly Lys
565 570 575
Leu Val Asp Gly Phe Thr Ile Glu Trp Arg Gly Glu Gly Leu Thr Ala
580 585 590
Gly Lys Asp Tyr Thr Phe Thr Leu Gln Ala Trp Lys Asn Glu Ala Asp
595 600 605
Lys Asn Ser Gly Ala Glu Pro Asn Val Gly Glu Ala Met Phe Ile Ile
610 615 620
His Gly Thr Lys Gln Ala Tyr Ile Tyr Ser Asp Val Val Met Lys Thr
625 630 635 640
Asp Ile Ser Gln Pro Ile Arg Leu Ala Ser Ala Asn Asp Asn Tyr Arg
645 650 655
Thr Ile Glu Phe Ser Ala Pro Val Thr Leu Asn Ala Val Val Asn Leu
660 665 670
Gly Met Gly Thr Ser Ala Asp Cys Thr Val Glu Pro Ser Ala Asp Arg
675 680 685
Thr Ala Trp Thr Val Thr Ile Pro Glu Tyr Val Met Ser Gln Tyr Gly
690 695 700
Glu Phe Ser Val Asn Val Phe Ala Lys Asp Asp Glu Gly Arg Ala Val
705 710 715 720
Asn Lys Thr Glu Asn Gly Leu Gly Ile Ile Met Gly Thr Glu Asp Asn
725 730 735
Thr Trp Phe Gln Ile Asp Phe Val Ser Glu Ser Leu Ala Pro Asp Phe
740 745 750
Thr Val Thr Pro Ala Asn Glu Ser Val Leu Glu Ser Leu Ser Thr Val
755 760 765
Thr Phe Gly Tyr Glu Gly Ser Ile Ser Ile Asn Trp Asn Asn Ser Glu
770 775 780
Lys Ile Thr Ile Tyr Asn Arg Thr Thr Arg Glu Lys Ile Ala Glu Phe
785 790 795 800
Ser Gly Asp Asp Val Val Leu Asp Glu Asp Pro Asp Asp Tyr Trp Ala
805 810 815
Pro Ile Leu Ser Cys His Ile Thr Leu Pro Glu Pro Val Thr Ala Ile
820 825 830
Gly Val Tyr Asp Val Asn Val Pro Ala Gly Phe Phe Val Leu Gly Glu
835 840 845
Gln Phe Asp Ser Asn Leu Ser Lys Ala Thr Ser Ile Val Tyr Glu Ile
850 855 860
Lys Glu Pro Val Glu Pro Phe Gly Ile Glu Ile Ser Pro Thr Ala Gly
865 870 875 880
Leu Val Ser Glu Ile Pro Ser Lys Leu Ile Val Thr Val Thr Asp Arg
885 890 895
Ser Ser Cys Asn Phe Thr Ala Asn Pro Thr Leu Thr Asp Asn Thr Gly
900 905 910
Asn Ser Tyr Pro Val His Asn Asp Phe Asp Trp Gly Ile Pro Glu Met
915 920 925
Asn Lys Phe Val Ile Ile Leu Asp Asn Gly Ala Ile Thr Ala Asp Gly
930 935 940
Ile Tyr Thr Leu Thr Ile Pro Ala Gly Ser Ile Ile Gly Asp Asp Glu
945 950 955 960
Thr Asp Leu Asn Lys Glu Asp Phe Val Phe Ile Tyr Thr Ile Asn Leu
965 970 975
Ala Gly Ile Thr Glu Leu Val Asn Asn Glu Gly Gly Lys Val Asp Val
980 985 990
Tyr Thr Leu Lys Gly Thr Leu Leu Met Lys Asp Ala Asp Ala Ser Ala
995 1000 1005
Val Asn Lys Leu Ala Lys Gly Met Tyr Ile Ile Asn Gly Lys Lys
1010 1015 1020
Val Phe Ile Arg
1025
<210> 28
<211> 3084
<212> DNA
<213> Prevotella rodentium
<400> 28
atgttttcat ttatgggcat cggacaggcc atagccgatg ttgtgggttc gggatcaaaa 60
gaagatccgt acgtgcttga aaacggcaca acactgacat tgaaagcata ccagtctttc 120
tatgccaaat tcacggcccc ggctgacggg atcttctcat tatatacaaa agacagctat 180
gccctatata ccgacgagag tttcagtcag attgacgagt cggcaaacat tacattcaac 240
ggaagctaca gtgagacggc atactacttt aactgtaaag caggtgttac ctattatata 300
ggaaacagtt ttgtaatggt cggaggtacg gttacagcca aattcaacac agaggctgaa 360
ccattggttc tgcgtgaaat atcacctgag gccggttctg ttttcaatgc aggcaaagga 420
agcgtagaac tgacattcaa ccaaaaagtc caaatagctt ctgttactgt caatgctggc 480
acatcctcaa agacaatgtc agccaatgta aatggagtat atgtatctgt agacgtaaaa 540
gactggctga accaaatgta tgatgaaggt aaggtcaaag aaggaaatga gatcagcttc 600
aagttcaaag gcgtggcccc aaccatttct ccaagcaaat tatacaacgg caatggagag 660
ttggacatta catacacagc aggcaagaaa cccctgcaac ttgtaagcag taccaacacc 720
ccaatgagca ctcctgccgt aaccacattc aagtcattct atacatccga tgacaaccaa 780
ggcatcgtaa cactggaatt cagtgacgaa gtgaactttt cggaaggcaa caaaccaacc 840
gcaaccctta catatggaaa tatggaccaa gatgccgacg gcgaatacta cacagaatca 900
ctccccatac ttccacttgg aagcaacatt ttaatggtta atctcaagga caagctgcgc 960
cgtgcacagg acatggtgac ttccggcact gtttacgacg gaataacact atccatatca 1020
cacgtaaagg atatggacgg aaactatgcc tatggctcgg gctccggtgt actcggatcg 1080
ttcgcattca actacaaact atctgaaata aactataccg ttgatacaga ctgggcacta 1140
cttgacgcaa ccggcaatcc gacaaacaaa gacgtcatag attcagacac caaaagcata 1200
gaactatgga tgagcgaaaa cgacggtcag ataacattca acggagtaga attcaaatat 1260
accgaaagcg gtactgaaaa ggtcaagaca atgaccctca gcgagattaa agttgacaac 1320
aatggaaacg aaacaacaat aacaatcccc gttcccaaca tcgcagccga tgccggtagc 1380
gacatcgcca tatctctcaa agacgtggaa cgccccgacg gtattaccac cagtatcgat 1440
ccggccgcac taagctattt cacaaaaaca ttcacaacaa caggcataac tgaaagcaaa 1500
ttcgacatta caagtgccat atggatgtat gaagagaatg aaggcaatat tgtcgaagtc 1560
aatatgataa acggtaatat cggtgtactg acaagaggca gcaaatctgt aatcaaaacc 1620
aacaaagaca atgagattgg ttatgtagaa tgggaaatca gaggcgttga caatcctgat 1680
atggaatata tcagagctgg atatgtagat acaaccatga caggcaagtt ggtagacggc 1740
tttaccatag aatggagagg agaaggcttg acagccggca aagactacac tttcacgctt 1800
caggcatgga agaatgaagc cgacaaaaac agtggtgccg agcctaatgt aggtgaagca 1860
atgttcatca tacacggtac aaaacaagcc tacatttaca gcgatgttgt gatgaagact 1920
gatataagcc agcctatcag acttgcttct gccaatgaca attaccgtac catcgagttc 1980
agcgctcctg taacgttaaa tgccgtggta aaccttggca tggggacatc ggcagattgc 2040
accgttgaac cctcggccga ccgtactgca tggacagtca caatacctga atatgtaatg 2100
tcacaatatg gagagttcag cgtgaatgtc tttgcaaaag atgacgaagg gcgtgcagta 2160
aacaagactg agaacggact gggtataata atggggacag aagacaatac gtggttccag 2220
attgattttg tcagtgagag ccttgccccg gactttacag taacaccggc aaacgaatca 2280
gtgctcgaga gtctcagcac cgtcactttc ggttatgaag gcagtatcag catcaactgg 2340
aacaacagtg agaaaatcac catatataac agaacaacac gtgagaagat agcagagttc 2400
agcggtgatg acgtcgtgtt ggatgaagat ccggatgact attgggctcc aatactctca 2460
tgccacatca cacttcccga gcctgtcact gctattggag tctacgatgt aaatgttcct 2520
gccggtttct tcgttcttgg tgagcagttc gacagtaact taagtaaggc aacaagcata 2580
gtatacgaaa taaaggaacc ggttgagccg ttcggaatag aaataagccc tactgccgga 2640
ttggtaagcg agattccgtc aaaactcata gtaacggtaa cggacagaag cagttgtaac 2700
tttactgcca accccacatt gaccgacaat accggtaaca gctatcccgt acacaacgat 2760
ttcgactggg gtatcccgga aatgaacaag ttcgtcatca ttcttgacaa cggagccatc 2820
acggctgacg gaatatacac actcactatt cctgccggct cgattatagg tgacgacgag 2880
acagacctca acaaggaaga ctttgtattc atatatacaa tcaatctggc cggaatcaca 2940
gaactcgtca acaatgaggg cggaaaggtt gacgtataca ctctcaaagg aacattgttg 3000
atgaaagacg ccgatgcatc agcagtaaac aagctggcaa aaggcatgta cataatcaat 3060
ggtaaaaaag tgttcataag ataa 3084
<210> 29
<211> 1528
<212> DNA
<213> Paraprevotella clara
<400> 29
atgaagagtt tgatcctggc tcaggatgaa cgctagctac aggcttaaca catgcaagtc 60
gaggggcagc atggactcag ctttgctgag tttgatggcg accggcgcac gggtgagtaa 120
cgcgtatcca acctgccctt tactccggga tagtctcctg aaagggagtt taataccgga 180
tgtgtttgtt tttccgcatg ggagcgacaa ataaagatta attggtaaag gatggggatg 240
cgtcccatta gcttgttggc ggggtaacgg cccaccaagg cgacgatggg taggggttct 300
gagaggaagg tcccccacat tggaactgag acacggtcca aactcctacg ggaggcagca 360
gtgaggaata ttggtcaatg ggcgggagcc tgaaccagcc aagtagcgtg aaggacgacg 420
gccctacggg ttgtaaactt cttttataag ggaataaagt tcgccacgcg tggtgttttg 480
tatgtacctt atgaataagc atcggctaat tccgtgccag cagccgcggt aatacggaag 540
atgcgagcgt tatccggatt tattgggttt aaagggagcg taggcgggct tttaagtcag 600
cggtcaaatg ccacggctca accgtggcca gccgttgaaa ctgtaagcct tgagtctgca 660
cagggcacat ggaattcgtg gtgtagcggt gaaatgctta gatatcacga agaactccga 720
tcgcgaaggc attgtgccgg ggcagcactg acgctgaggc tcgaaagtgc gggtatcaaa 780
caggattaga taccctggta gtccgcacgg taaacgatga atgctcgcta tgggcgatat 840
attgtccgtg gccaagcgaa agcgttaagc attccacctg gggagtacgc cggcaacggt 900
gaaactcaaa ggaattgacg ggggcccgca caagcggagg aacatgtggt ttaattcgat 960
gatacgcgag gaaccttacc cgggcttgaa ttgcaggtgc atgagtcaga gacggctctt 1020
tccttcggga ctcctgtgaa ggtgctgcat ggttgtcgtc agctcgtgcc gtgaggtgtc 1080
ggcttaagtg ccataacgag cgcaaccctt ctccccagtt gccatcgggt aatgccgggc 1140
cctctgggga cactgccatc gtaagatgcg aggaaggtgg ggatgacgtc aaatcagcac 1200
ggcccttacg tccggggcta cacacgtgtt acaatggggg gtacagaggg ccgctgtccg 1260
gtgacggtcg gccaatccct aaaacccctc tcagttcgga ctggagtctg caacccgact 1320
ccacgaagct ggattcgcta gtaatcgcgc atcagccatg gcgcggtgaa tacgttcccg 1380
ggccttgtac acaccgcccg tcaagccatg aaagccgggg gtgcctgaag tccgtgaccg 1440
cgagggtcgg cctagggtaa aactggtgat tggggctaag tcgtaacaag gtagccgtac 1500
cggaaggtgc ggctggaaca cctccttt 1528
<210> 30
<211> 1481
<212> DNA
<213> Paraprevotella xylaniphila
<400> 30
gatgaacgct agctacaggc ttaacacatg caagtcgagg ggcagcatga acttagcttg 60
ctaagtttga tggcgaccgg cgcacgggtg agtaacgcgt atccaacctg ccctttacgc 120
ggggatagcc ttctgaaagg aagtttaata cccgatgaat tcgtttagtc gcatggcttg 180
atgaataaag atttatcagt aaaggatggg gatgcgtccc attagcttgt tggcggggta 240
acggcccacc aaggcgacga tgggtagggg ttctgagagg aaggtccccc acattggaac 300
tgagacacgg tccaaactcc tacgggaggc agcagtgagg aatattggtc aatgggcgcg 360
agcctgaacc agccaagtag cgtggaggac gacggcccta cgggttgtaa actcctttta 420
taaggggata aagttggcca tgtatggcca tttgcaggta ccttatgaat aagcatcggc 480
taattccgtg ccagcagccg cggtaatacg gaagatgcga gcgttatccg gatttattgg 540
gtttaaaggg agcgtaggcg ggcagtcaag tcagcggtca aatggcgcgg ctcaaccgcg 600
ttccgccgtt gaaactggca gccttgagta tgcacagggt acatggaatt cgtggtgtag 660
cggtgaaatg cttagatatc acgaggaact ccgatcgcgc aggcattgta ccggggcatt 720
actgacgctg aggctcgaag gtgcgggtat caaacaggat tagataccct ggtagtccgc 780
acagtaaacg atgaatgccc gctgtcggcg acatagtgtc ggcggccaag cgaaagcgtt 840
aagcattcca cctggggagt acgccggcaa cggtgaaact caaaggaatt gacgggggcc 900
cgcacaagcg gaggaacatg tggtttaatt cgatgatacg cgaggaacct tacccgggct 960
tgaatcgcag gtgcatgggc cggagacggc cctttccttc gggactcctg cgaaggtgct 1020
gcatggttgt cgtcagctcg tgccgtgagg tgtcggctta agtgccataa cgagcgcaac 1080
ccccctcccc agttgccatc gggtaatgcc gggcactttg gggacactgc caccgcaagg 1140
tgcgaggaag gtggggatga cgtcaaatca gcacggccct tacgtccggg gcgacacacg 1200
tgttacaatg gggggtacag agggccgctg cccggtgacg gttggccaat ccctaaagcc 1260
cctctcagtt cggactggag tctgcaaccc gactccacga agctggattc gctagtaatc 1320
gcgcatcagc catggcgcgg tgaatacgtt cccgggcctt gtacacaccg cccgtcaagc 1380
catgaaagcc gggggtgcct gaagtccgtg accgcgaggg tcggcctagg gtaaaaccgg 1440
tgattggggc taagtcgtaa caaggtagcc gtaccggaag g 1481
<210> 31
<211> 1535
<212> DNA
<213> Prevotella rara
<400> 31
ttttacagtg gagagtttga tcctggctca ggatgaacgc tagctacagg cttaacacat 60
gcaagtcgcg gggcagcatg aaggatgctt gcattctttg atggcgaccg gcgcacgggt 120
gagtaacgcg tatccaacct gcctgttacc acggcacagc ccgtcgaaag gcggattaat 180
gccgtatgtg gtcttgaaag ggcatctgat catgactaaa ggttcagcgg taacggatgg 240
ggatgcgtcc gattagctag acggcggggt aacggcccac cgtggcgacg atcggtaggg 300
gttctgagag gaaggtcccc cacactggaa ctgagacacg gtccagactc ctacgggagg 360
cagcagtgag gaatattggt caatgggcgt aagcctgaac cagccaagta gcgtgaggga 420
agactgccct atgggttgta aacctctttt gtgcggggat aaagtgaggg acgtgtccct 480
tattgcaggt accgcacgaa taaggaccgg ctaattccgt gccagcagcc gcggtaatac 540
ggaaggtccg ggcgttatcc ggatttattg ggtttaaagg gagcgcaggc cgtcctttaa 600
gcgtgctgtg aaatgccgcg gctcaaccgt ggcactgcag cgcgaactgg aggacttgag 660
tacgcacgag gtaggcggaa ttcgtggtgt agcggtgaaa tgcttagata tcacgaagaa 720
ctccgattgc gaaggcagct taccggagcg caactgacgc tgaggctcga aagcgcgggt 780
atcgaacagg attagatacc ctggtagtcc gcgcggtaaa cgatggatgc ccgccgttgg 840
gatattgatt tcagcggcca agcgaaagcg ttaagcatcc cacctgggga gtacgccggc 900
aacggtgaaa ctcaaaggaa ttgacggggg cccgcacaag cggaggaaca tgtggtttaa 960
ttcgatgata cgcgaggaac cttacccggg cttgaattgc aggagaacga tccagagatg 1020
gtgaggccct tcggggctcc tgtgaaggtg ctgcatggtt gtcgtcagct cgtgccgtga 1080
ggtgtcggct caagtgccat aacgagcgca acccctgtcc atagttgcca tcaggtttgc 1140
tgggcactct gtggagactg ccgccgtaag gtgtgaggaa ggtggggatg acgtcaaatc 1200
agcacggccc ttacgtccgg ggctacacac gtgttacaat ggggcataca gcgagttgga 1260
tgtgcgcaag tacgtccgga tcaataaagt gcctctcagt tcggactggg gtctgcaacc 1320
cgaccccacg aagctggatt cgctagtaat cgcgcatcag ccatggcgcg gtgaatacgt 1380
tcccgggcct tgtacacacc gcccgtcaag ccatgaaagc cgggggcgcc tgaagtccgt 1440
gaccgcgagg gtcggcctag ggtgaaaccg gtgattgggg ctaagtcgta acaaggtagc 1500
cgtaccggaa ggtgcggctg gaacacctcc tttct 1535
<210> 32
<211> 1438
<212> DNA
<213> Prevotella rodentium
<400> 32
tgatggcgac cggcgcacgg gtgagtaacg cgtatccaac ctgcccttta ctatgggaca 60
gcccgtcgaa aggcggatta ataccgtatg ttgtcatgtt gatgcatatt ttcatgacca 120
aaggcttcgg ccggtaaagg atggggatgc gtctgattag cttgccggcg gggtaacggc 180
ccaccggggc gacgatcagt aggggttctg agaggaaggt cccccacatt ggaactgaga 240
cacggtccaa actcctacgg gaggcagcag tgaggaatat tggtcaatgg gcgcgagcct 300
gaaccagcca agtagcgtgc aggacgacgg ccctatgggt tgtaaactgc ttttatacgg 360
ggataaagta tgccacgtgt ggtttattgc aggtaccgta tgaataagga ccggctaatt 420
ccgtgccagc agccgcggta atacggaagg tccgggcgtt atccggattt attgggttta 480
aagggagcgt aggccgggtt ttaagcgtgc cgtgaaatgt cggggctcaa ccttgacact 540
gcggcgcgaa ctggagtcct tgagtgcgcg gaacgtatgc ggaattcgtg gtgtagcggt 600
gaaatgctta gatatcacga agaaccccga ttgcgaaggc agcatacggc agcgctactg 660
acgctgaagc tcgaaggcgc gggtatcgaa caggattaga taccctggta gtccgcgcgg 720
taaacgatgg atgcccgctg ttggcgatat attgtcagcg gccaagcgaa agcgttaagc 780
atcccacctg gggagtacgc cggcaacggt gaaactcaaa ggaattgacg ggggcccgca 840
caagcggagg aacatgtggt ttaattcgat gatacgcgag gaaccttacc cgggcttgaa 900
ctgccggcga acgatccaga gatggtgagg cccttcgggg cgccggcgga ggtgctgcat 960
ggttgtcgtc agctcgtgcc gtgaggtgtc ggcttaagtg ccataacgag cgcaacccct 1020
gttctcagtt gccatcgggt aatgccgggc actctgtgaa gactgcctcc gtaaggagtg 1080
aggagggtgg ggatgacgtc aaatcagcac ggcccttacg tccggggcta cacacgtgtt 1140
acaatgggtg gtacagacgg tcggcgtccg gcaacgtacg tccaatccgt aaagccgccc 1200
tcagttcgga ctggggtctg caacccgacc ccacgaagct ggattcgcta gtaatcgcgc 1260
atcagccatg gcgcggtgaa tacgttcccg ggccttgtac acaccgcccg tcaagccatg 1320
aaagccgggg gcgcctgaag tccgtgaccg cgagggtcgg cctagggtga aaccggtgat 1380
tggggctaag tcgtaacaag gtagccgtac cggaaggtgc ggctggaaca cctccttt 1438
<210> 33
<211> 1531
<212> DNA
<213> Prevotella muris
<400> 33
acaatgtaga gtttgatcct ggctcaggat gaacgctggc tacaggctta acacatgcaa 60
gtcgaggggc agcatgacat gttttcggac gtgtcgatgg cgaccggcgc acgggtgagt 120
aacgcgtatc cgacctgccc cgtaccaggg cacagcccgt cgaaagacgg attaatgccc 180
tatgttctcc ggaagatcca tgttttccgg agcaaaggtt attccggtac gggatgggga 240
tgcgtccgat tagcttgctg gcggggtaac ggcccaccag ggcaccgatc ggtaggggtt 300
ctgagaggaa ggtcccccac actggaactg agacacggtc cagactccta cgggaggcag 360
cagtgaggaa tattggtcaa tggggggaac cctgaaccag ccaagtagcg tgcaggatga 420
cggccctatg ggttgtaaac tgcttttgtg cgtggataaa gtaggccact tgtggccttt 480
tgcaggtacc gcacgaataa ggaccggcta attccgtgcc agcagccgcg gtaatacgga 540
aggtccgggc gttatccgga tttattgggt ttaaagggag cgtaggccgc ctgtcaagcg 600
tgctgtgaaa cgccgtggct caaccacggt cctgcagcgc gaactggcgg gcttgagtgt 660
gcggaaggca cgcggaattc gtggtgtagc ggtgaaatgc ttagatatca cgaagaactc 720
cgattgcgaa ggcagcgtgc cgcagcatta ctgacgctga tgctcgaaag cgcgggtatc 780
gaacaggatt agataccctg gtagtccgcg cggtaaacga tggatgcccg ctgtcggcga 840
tatacagtcg gcggccaagc gaaagcgtta agcatcccac ctggggagta cgccggcaac 900
ggtgaaactc aaaggaattg acgggggccc gcacaagcgg aggaacatgt ggtttaattc 960
gatgatacgc gaggaacctt acccgggctt gaattgcaga tgaatgattc agagatgatg 1020
aagtccttcg ggacatctgt gaaggtgctg catggttgtc gtcagctcgt gccgtgaggt 1080
gtcggcttaa gtgccataac gagcgcaacc ccttccctca gttgccatcg ggtcatgccg 1140
ggcactctgt gggtactgcc tccgcaagga gcgaggaagg cggggatgac gtcaaatcag 1200
cacggccctt acgtccgggg ctacacacgt gttacaatgg ggcatacagc gagcaggtgc 1260
cgtgcaaacg gtgtcgaatc ttgaaagtgc ccctcagttc ggactggggt ctgcaacccg 1320
accccacgaa gctggattcg ctagtaatcg cgcatcagcc atggcgcggt gaatacgttc 1380
ccgggccttg tacacaccgc ccgtcaagcc atgaaagccg ggggcgcctg aagtccgtga 1440
ccgtgagggt cggcctaggg tgaaaccggt gattggggct aagtcgtaac aaggtagccg 1500
taccggaagg tgcggctgga acacctcctt t 1531
<210> 34
<211> 5
<212> PRT
<213> Artificial Sequence
<220>
<223> Synthesized peptide
<400> 34
Gly Gly Gly Gly Ser
1 5
<210> 35
<211> 24
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 35
gtcgtggagt ctactggtgt cttc 24
<210> 36
<211> 24
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 36
gtcatatttc tcgtggttca cacc 24
<210> 37
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 37
tgtgaccctc aatgccagag 20
<210> 38
<211> 19
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 38
agcactgggg catcaacac 19
<210> 39
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 39
agrgtttgat ymtggctcag 20
<210> 40
<211> 22
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 40
tacggytacc ttgttacgac tt 22
<210> 41
<211> 90
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<220>
<221> misc_feature
<222> (30)..(37)
<223> n is a, c, g, or t
<400> 41
aatgatacgg cgaccaccga gatctacacn nnnnnnnaca ctctttccct acacgacgct 60
cttccgatct agrgtttgat ymtggctcag 90
<210> 42
<211> 85
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<220>
<221> misc_feature
<222> (25)..(32)
<223> n is a, c, g, or t
<400> 42
caagcagaag acggcatacg agatnnnnnn nngtgactgg agttcagacg tgtgctcttc 60
cgatcttgct gcctcccgta ggagt 85
<210> 43
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 43
cgtaggagtt tggaccgtgt 20
<210> 44
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 44
gcatgggagc gacaaataaa 20
<210> 45
<211> 21
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 45
aagagtgatt ggcgtccgta c 21
<210> 46
<211> 21
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 46
atggacacgt cactggcaga g 21
<210> 47
<211> 38
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 47
gtggatcccc cgggctgcag catccctcct ccatctgt 38
<210> 48
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 48
ctggaagata ggcaattagt ttgggacgta ggggcaatc 39
<210> 49
<211> 27
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 49
tcccgaccga ctgatccttt ttaaccc 27
<210> 50
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 50
ctgtgcccag taattttgtg 20
<210> 51
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 51
gtggatcccc cgggctgcac cccttgtcca tgttagacg 39
<210> 52
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 52
ctggaagata ggcaattagt gaagtgacag gggatgcta 39
<210> 53
<211> 33
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 53
aaacaaataa atgaagaaaa ttttatcttt gct 33
<210> 54
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 54
ctggaagata ggcaattagt gaagtgacag gggatgcta 39
<210> 55
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 55
gtggatcccc cgggctgcaa cactgatgtc gttgccttt 39
<210> 56
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 56
ctggaagata ggcaattagg caaccttacc acagggaaa 39
<210> 57
<211> 33
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 57
aaacaaataa atgaagaaaa ttttatcttt gct 33
<210> 58
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 58
ctggaagata ggcaattagg caaccttacc acagggaaa 39
<210> 59
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 59
gtggatcccc cgggctgcac aagtagagtt ccggttcgg 39
<210> 60
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 60
ctggaagata ggcaattaga aatcacgacc ctcgatgtg 39
<210> 61
<211> 33
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 61
aaacaaataa atgaagaaaa ttttatcttt gct 33
<210> 62
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 62
ctggaagata ggcaattaga aatcacgacc ctcgatgtg 39
<210> 63
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 63
gtggatcccc cgggctgcaa gttcgttctt ctcgatggc 39
<210> 64
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 64
ctggaagata ggcaattagc aaccgcgggt ataacaaga 39
<210> 65
<211> 33
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 65
aaacaaataa atgaagaaaa ttttatcttt gct 33
<210> 66
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 66
ctggaagata ggcaattagc aaccgcgggt ataacaaga 39
<210> 67
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 67
gtggatcccc cgggctgcat ggggcgtttg tatttgtca 39
<210> 68
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 68
ctggaagata ggcaattaga aaccgctgac ggtctattc 39
<210> 69
<211> 33
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 69
aaacaaataa atgaagaaaa ttttatcttt gct 33
<210> 70
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 70
ctggaagata ggcaattaga aaccgctgac ggtctattc 39
<210> 71
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 71
gtggatcccc cgggctgcat ccagtcgtca gactttcct 39
<210> 72
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 72
ctggaagata ggcaattagg aagggactaa aatcgccgt 39
<210> 73
<211> 33
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 73
aaacaaataa atgaagaaaa ttttatcttt gct 33
<210> 74
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 74
ctggaagata ggcaattagg aagggactaa aatcgccgt 39
<210> 75
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 75
gtggatcccc cgggctgcag ggtcgcttac attttgcac 39
<210> 76
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 76
ctggaagata ggcaattagt cggtgattat gctttcgca 39
<210> 77
<211> 33
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 77
aaacaaataa atgaagaaaa ttttatcttt gct 33
<210> 78
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 78
ctggaagata ggcaattagt cggtgattat gctttcgca 39
<210> 79
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 79
gtggatcccc cgggctgcac ttgcttcaaa tcacgctgg 39
<210> 80
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 80
ctggaagata ggcaattagg ttgcgtaatg tgaacggtg 39
<210> 81
<211> 33
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 81
aaacaaataa atgaagaaaa ttttatcttt gct 33
<210> 82
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 82
ctggaagata ggcaattagg ttgcgtaatg tgaacggtg 39
<210> 83
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 83
gtggatcccc cgggctgcat gtcgaaaggc atcatctgg 39
<210> 84
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 84
ctggaagata ggcaattaga ccattacggc cattaccac 39
<210> 85
<211> 33
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 85
aaacaaataa atgaagaaaa ttttatcttt gct 33
<210> 86
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 86
ctggaagata ggcaattaga ccattacggc cattaccac 39
<210> 87
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 87
gtggatcccc cgggctgcag tgccagtttc aaccaatcg 39
<210> 88
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 88
ctggaagata ggcaattagc ttgtacccca gcagatgtc 39
<210> 89
<211> 33
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 89
aaacaaataa atgaagaaaa ttttatcttt gct 33
<210> 90
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 90
ctggaagata ggcaattagg taagctggcg cttgaaatc 39
<210> 91
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 91
gtggatcccc cgggctgcat tgttttggtt gtaactgcc 39
<210> 92
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 92
ctggaagata ggcaattagg tcaatacgaa ctgcaactg 39
<210> 93
<211> 33
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 93
aaacaaataa atgaagaaaa ttttatcttt gct 33
<210> 94
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 94
ctggaagata ggcaattagg tcaatacgaa ctgcaactg 39
<210> 95
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 95
gtggatcccc cgggctgcag accaattaag cacgtcaag 39
<210> 96
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 96
ctggaagata ggcaattagt ggtacaattc gctcaaaga 39
<210> 97
<211> 33
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 97
aaacaaataa atgaagaaaa ttttatcttt gct 33
<210> 98
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 98
ctggaagata ggcaattagt ggtacaattc gctcaaaga 39
<210> 99
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 99
gtggatcccc cgggctgcat actcccgtgt ttttagtgg 39
<210> 100
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 100
ctggaagata ggcaattagg gatgccagtt acctgatag 39
<210> 101
<211> 33
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 101
aaacaaataa atgaagaaaa ttttatcttt gct 33
<210> 102
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 102
ctggaagata ggcaattagg gatgccagtt acctgatag 39
<210> 103
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 103
gtggatcccc cgggctgcat caaaccacac acagagaaa 39
<210> 104
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 104
ctggaagata ggcaattagg aattcggcca aaccttatg 39
<210> 105
<211> 33
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 105
aaacaaataa atgaagaaaa ttttatcttt gct 33
<210> 106
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 106
ctggaagata ggcaattagg aattcggcca aaccttatg 39
<210> 107
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 107
gtggatcccc cgggctgcaa gacaatcgag gcatcatac 39
<210> 108
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 108
ctggaagata ggcaattaga tagcgtgaat ggaaaacct 39
<210> 109
<211> 33
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 109
aaacaaataa atgaagaaaa ttttatcttt gct 33
<210> 110
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 110
ctggaagata ggcaattaga tagcgtgaat ggaaaacct 39
<210> 111
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 111
gtggatcccc cgggctgcaa tctatctcaa cctcctgct 39
<210> 112
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 112
ctggaagata ggcaattagg aaggaatcag cgtagatgt 39
<210> 113
<211> 33
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 113
aaacaaataa atgaagaaaa ttttatcttt gct 33
<210> 114
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 114
ctggaagata ggcaattagg aaggaatcag cgtagatgt 39
<210> 115
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 115
gtggatcccc cgggctgcat gtgccgataa cctgaatgg 39
<210> 116
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 116
ctggaagata ggcaattagc agagaaaagg gctggtgtt 39
<210> 117
<211> 33
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 117
aaacaaataa atgaagaaaa ttttatcttt gct 33
<210> 118
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 118
ctggaagata ggcaattagc agagaaaagg gctggtgtt 39
<210> 119
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 119
tcccgaccga ggtctttgtc cccttctccg 30
<210> 120
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 120
tcttacaccg gcgggttgct ttttatgcct 30
<210> 121
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 121
gcctgatgca tggaaaaacc 20
<210> 122
<211> 19
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 122
ctggaagata ggcaattag 19
<210> 123
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 123
gtggatcccc cgggctgcac ggataacgga attacccat 39
<210> 124
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 124
ctggaagata ggcaattagt ctgacaaagg ttggttgaa 39
<210> 125
<211> 33
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 125
aaacaaataa atgaagaaaa ttttatcttt gct 33
<210> 126
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 126
ctggaagata ggcaattagt ctgacaaagg ttggttgaa 39
<210> 127
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 127
gtggatcccc cgggctgcat gtataacccc cagaaggaa 39
<210> 128
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 128
ctggaagata ggcaattaga acaataccgt caacctctc 39
<210> 129
<211> 33
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 129
aaacaaataa atgaagaaaa ttttatcttt gct 33
<210> 130
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 130
ctggaagata ggcaattaga acaataccgt caacctctc 39
<210> 131
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 131
gtggatcccc cgggctgcat ttattagccg tggtatcgg 39
<210> 132
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 132
ctggaagata ggcaattaga ctccaacagc atcttcaat 39
<210> 133
<211> 33
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 133
aaacaaataa atgaagaaaa ttttatcttt gct 33
<210> 134
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 134
ctggaagata ggcaattaga ctccaacagc atcttcaat 39
<210> 135
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 135
gtggatcccc cgggctgcag tagtgtatcg ctttgtcct 39
<210> 136
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 136
ctggaagata ggcaattaga ttcattactg gtcggaacc 39
<210> 137
<211> 33
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 137
aaacaaataa atgaagaaaa ttttatcttt gct 33
<210> 138
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 138
ctggaagata ggcaattaga ttcattactg gtcggaacc 39
<210> 139
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 139
gtggatcccc cgggctgcag ggatactgaa tgtctgtcc 39
<210> 140
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 140
ctggaagata ggcaattaga cagtgtaatc atcaacgct 39
<210> 141
<211> 33
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 141
aaacaaataa atgaagaaaa ttttatcttt gct 33
<210> 142
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 142
ctggaagata ggcaattaga cagtgtaatc atcaacgct 39
<210> 143
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 143
gtggatcccc cgggctgcaa ttttcaatcc agtccctcc 39
<210> 144
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 144
ctggaagata ggcaattagt acagtttgcc gataccaaa 39
<210> 145
<211> 33
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 145
aaacaaataa atgaagaaaa ttttatcttt gct 33
<210> 146
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 146
ctggaagata ggcaattagt acagtttgcc gataccaaa 39
<210> 147
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 147
gtggatcccc cgggctgcat atgaactaca aagtcatcg 39
<210> 148
<211> 27
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 148
ttcaagaagc accaccgtta tgatata 27
<210> 149
<211> 27
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 149
taacggtggt gcttcttgaa aattaag 27
<210> 150
<211> 35
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 150
ctggaagata ggcaattagg acgatttcgt ggaca 35
<210> 151
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 151
gtggatcccc cgggctgcat atgaactaca aagtcatcg 39
<210> 152
<211> 35
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 152
ctggaagata ggcaattagg acgatttcgt ggaca 35
<210> 153
<211> 19
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 153
tcaacgcacc attactttc 19
<210> 154
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 154
atgaaaaaca gtttacttgc 20
<210> 155
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 155
gtggatcccc cgggctgcat acaatccgtt ggaactgtc 39
<210> 156
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 156
attagtattt cagcattcat ttgctggtga 30
<210> 157
<211> 40
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 157
atgaatgctg aaatactaat tttagataga ctataatttg 40
<210> 158
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 158
ctggaagata ggcaattagg aactggtcgt aaaaatcac 39
<210> 159
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 159
gtggatcccc cgggctgcat acaatccgtt ggaactgtc 39
<210> 160
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 160
ctggaagata ggcaattagg aactggtcgt aaaaatcac 39
<210> 161
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 161
gtggatcccc cgggctgcat ccagtcgtca gactttcct 39
<210> 162
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 162
ctggaagata ggcaattagg aagggactaa aatcgccgt 39
<210> 163
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 163
gtggatcccc cgggctgcaa tccttccact tttcctctc 39
<210> 164
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 164
aattgatcgt tctaaagccg acgatttaag 30
<210> 165
<211> 35
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 165
cggctttaga acgatcaatt ttaaaatgaa tatat 35
<210> 166
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 166
ctggaagata ggcaattagt tcaacgaaaa cagaacatc 39
<210> 167
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 167
gtggatcccc cgggctgcaa tccttccact tttcctctc 39
<210> 168
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 168
ctggaagata ggcaattagt tcaacgaaaa cagaacatc 39
<210> 169
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 169
gtggatcccc cgggctgcaa gacaatcgag gcatcatac 39
<210> 170
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 170
ctggaagata ggcaattaga tagcgtgaat ggaaaacct 39
<210> 171
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 171
attcgtaaaa caccaccacc accaccactg 30
<210> 172
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 172
tcacgccgtc catggtatat ctccttctta 30
<210> 173
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 173
atataccatg gacggcgtga gccaacctac 30
<210> 174
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 174
ggtggtggtg ttttacgaat actttcttca 30
<210> 175
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 175
gatggtaaaa caccaccacc accaccactg 30
<210> 176
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 176
cttcaccaac catggtatat ctccttctta 30
<210> 177
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 177
atataccatg gttggtgaag atttacattt 30
<210> 178
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthesized oligonucleotide
<400> 178
ggtggtggtg ttttaccatc atcttcttac 30
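Each record in the listing above declares its length in the `<211>` field and its molecule type in `<212>`, so a transcribed copy can be checked mechanically. The following is a minimal sketch, not part of the filed listing: it assumes the plain-text layout shown here (numeric tags in angle brackets, nucleotides in blocks of ten with a running count at the end of each line, amino acids as three-letter codes above a line of position numbers). The function names `parse_st25` and `length_mismatches` are hypothetical.

```python
import re

# Matches a numeric field tag such as "<210> 35" or "<212> DNA".
TAG = re.compile(r"^<(\d{3})>\s*(.*)$")


def parse_st25(text):
    """Parse ST.25-style text into a list of record dictionaries."""
    records, current, in_sequence = [], None, False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = TAG.match(line)
        if match:
            tag, value = match.groups()
            in_sequence = False  # any tag line ends the previous sequence body
            if tag == "210":
                current = {"id": int(value), "length": None,
                           "type": None, "residues": []}
                records.append(current)
            elif current is not None:
                if tag == "211":
                    current["length"] = int(value)
                elif tag == "212":
                    current["type"] = value.strip()
                elif tag == "400":
                    in_sequence = True
            continue
        if in_sequence and current is not None:
            # Keep alphabetic tokens only; this drops running counts
            # and the position-number lines under protein sequences.
            words = [w for w in line.split() if w.isalpha()]
            if current["type"] == "PRT":
                current["residues"].extend(words)           # one token per residue
            else:
                current["residues"].extend("".join(words))  # one character per base
    return records


def length_mismatches(records):
    """Return (id, declared, counted) for records whose lengths disagree."""
    return [(r["id"], r["length"], len(r["residues"]))
            for r in records if r["length"] != len(r["residues"])]


# Two records copied from the listing above.
SAMPLE = """<210> 34
<211> 5
<212> PRT
<213> Artificial Sequence
<400> 34
Gly Gly Gly Gly Ser
1 5
<210> 35
<211> 24
<212> DNA
<213> Artificial Sequence
<400> 35
gtcgtggagt ctactggtgt cttc 24
"""

if __name__ == "__main__":
    print(length_mismatches(parse_st25(SAMPLE)))
```

An empty result means every declared length agrees with the counted residues. Degenerate bases such as `r`, `y`, `m` and `n` are counted as ordinary positions, which matches how the listing reports lengths for SEQ ID NOs 39 to 42.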

Claims (22)

1. A composition for degrading trypsin or TMPRSS2, the composition comprising as active ingredients the following bacteria:
a bacterium having 00502 protein, or a bacterium having a protein that has 30% or more sequence identity with the amino acid sequence of 00502 protein and has trypsin-binding ability; or
a bacterium having 00509 protein, or a bacterium having a protein that has 30% or more sequence identity with the amino acid sequence of 00509 protein and has trypsin-binding ability.
2. A composition for degrading trypsin or TMPRSS2, the composition comprising as active ingredients the following bacteria:
a bacterium having 00502 protein, or a bacterium having a protein that has 30% or more sequence identity with the amino acid sequence of 00502 protein and has trypsin-binding ability; and
a bacterium having 00509 protein, or a bacterium having a protein that has 30% or more sequence identity with the amino acid sequence of 00509 protein and has trypsin-binding ability.
3. A composition for degrading trypsin or TMPRSS2, the composition comprising as active ingredients the following bacteria:
a bacterium having a gene comprising the base sequence shown in SEQ ID NO. 1, or a bacterium having a gene that has 30% or more sequence identity with the base sequence shown in SEQ ID NO. 1 and encodes a protein having trypsin-binding ability; or
a bacterium having a gene comprising the base sequence shown in SEQ ID NO. 2, or a bacterium having a gene that has 30% or more sequence identity with the base sequence shown in SEQ ID NO. 2 and encodes a protein having trypsin-binding ability.
4. A composition for degrading trypsin or TMPRSS2, the composition comprising as active ingredients the following bacteria:
a bacterium having a gene comprising the base sequence shown in SEQ ID NO. 1, or a bacterium having a gene that has 30% or more sequence identity with the base sequence shown in SEQ ID NO. 1 and encodes a protein having trypsin-binding ability; and
a bacterium having a gene comprising the base sequence shown in SEQ ID NO. 2, or a bacterium having a gene that has 30% or more sequence identity with the base sequence shown in SEQ ID NO. 2 and encodes a protein having trypsin-binding ability.
5. The composition for degrading trypsin or TMPRSS2 according to any one of claims 1-4, wherein the bacterium has a type IX secretion system (T9SS).
6. The composition for degrading trypsin or TMPRSS2 according to claim 5, wherein the T9SS comprises PorV protein, PorU protein, PorN protein, PorM protein, PorL protein, PorK protein, or PorP protein.
7. The composition for degrading trypsin or TMPRSS2 according to any one of claims 1-6, wherein the bacterium is a bacterium of the genus Paraprevotella, Prevotella, Prevotellamassilia or Bacteroides.
8. The composition for degrading trypsin or TMPRSS2 according to claim 7, wherein the bacterium of the genus Paraprevotella is Paraprevotella clara, and the bacterium of the genus Prevotella is at least one bacterium selected from the group consisting of Prevotella rara, Prevotella rodentium and Prevotella muris.
9. The composition for degrading trypsin or TMPRSS2 according to any of claims 1-8, wherein the bacterium is a bacterium having a 16S rRNA gene comprising a base sequence as shown in SEQ ID NO. 3 or SEQ ID NO. 4 or a bacterium having a 16S rRNA gene comprising a base sequence having 97% or more sequence identity with the base sequence as shown in SEQ ID NO. 3 or SEQ ID NO. 4.
10. The composition for degrading trypsin or TMPRSS2 according to any one of claims 1-6, wherein the bacterium is at least one bacterium selected from the group consisting of Paraprevotella sp. msp 0303, Paraprevotella sp. msp 0335, Prevotellamassilia timonensis, Bacteroides sp. msp 0288, Bacteroides sp. msp 0410, Bacteroides sp. msp 0435 and Porphyromonas gingivalis.
11. The composition for degrading trypsin or TMPRSS2 of any one of claims 1-10, wherein the bacterium is a viable bacterium.
12. The composition for degrading trypsin or TMPRSS2 of any one of claims 1-10, wherein the bacterium is a dead bacterium.
13. A composition for degrading trypsin or TMPRSS2, comprising as active ingredients the following proteins:
00502 protein or a protein having 30% or more sequence identity to the amino acid sequence of said 00502 protein and having trypsin binding capacity; or alternatively
00509 protein or a protein having 30% or more sequence identity to the amino acid sequence of said 00509 protein and having trypsin binding capacity.
14. A composition for degrading trypsin or TMPRSS2, comprising as active ingredients the following proteins:
00502 protein or a protein having 30% or more sequence identity to the amino acid sequence of said 00502 protein and having trypsin binding capacity; and
00509 protein or a protein having 30% or more sequence identity to the amino acid sequence of said 00509 protein and having trypsin binding capacity.
15. A composition for degrading trypsin or TMPRSS2 according to any one of claims 1-14 for use in treating a disease caused by trypsin or TMPRSS2.
16. The composition for degrading trypsin or TMPRSS2 of any one of claims 1-15, wherein the disease caused by trypsin or TMPRSS2 is inflammatory bowel disease (ulcerative colitis, crohn's disease), irritable bowel disease, infection, acute pancreatitis, or chronic pancreatitis.
17. The composition for degrading trypsin or TMPRSS2 of claim 16, wherein the infection is a viral infection or a bacterial infection.
18. The composition for degrading trypsin or TMPRSS2 of claim 16 or 17, wherein the inflammatory bowel disease, irritable bowel disease or infection is a disease involving TMPRSS2 or IgA.
19. A diagnostic agent for a disease caused by trypsin or TMPRSS2, the diagnostic agent comprising a specific binding substance for detecting:
00502 protein or a protein having 30% or more sequence identity to the amino acid sequence of 00502 protein and having trypsin binding capacity; or alternatively
00509 protein or a protein having 30% or more sequence identity to the amino acid sequence of 00509 protein and having trypsin binding capacity.
20. A diagnostic agent for a disease caused by trypsin or TMPRSS2, the diagnostic agent comprising a primer set or probe for detecting:
a gene comprising a base sequence shown as SEQ ID NO. 1 or a gene having 30% or more sequence identity with a base sequence shown as SEQ ID NO. 1 and encoding a protein having trypsin binding ability; or alternatively
a gene comprising a base sequence shown as SEQ ID NO. 2 or a gene having 30% or more sequence identity with a base sequence shown as SEQ ID NO. 2 and encoding a protein having trypsin binding ability.
21. A quasi-drug for diseases caused by trypsin or TMPRSS2, which contains bacteria having the following proteins as active ingredients:
00502 protein or a protein having 30% or more sequence identity to the amino acid sequence of 00502 protein and having trypsin binding capacity; or alternatively
00509 protein or a protein having 30% or more sequence identity to the amino acid sequence of 00509 protein and having trypsin binding capacity.
22. A quasi-drug for diseases caused by trypsin or TMPRSS2, which contains as active ingredients the following proteins:
00502 protein or a protein having 30% or more sequence identity to the amino acid sequence of said 00502 protein and having trypsin binding capacity; or alternatively
00509 protein or a protein having 30% or more sequence identity to the amino acid sequence of said 00509 protein and having trypsin binding capacity.
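
Several of the claims above set a sequence-identity threshold (30% or more for the 00502 and 00509 proteins and for the genes of SEQ ID NO. 1 and SEQ ID NO. 2; 97% or more for the 16S rRNA genes of SEQ ID NO. 3 and SEQ ID NO. 4). The following is a minimal illustrative sketch of how such a threshold could be checked for two sequences that have already been aligned. It is not taken from the claims: the function names `percent_identity` and `meets_threshold`, the toy sequences, and the choice of identity definition (identical positions divided by alignment columns, ignoring columns that are gaps in both sequences) are all assumptions made for illustration, and the alignment tool and identity definition actually used may differ.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences (gaps written as '-').

    Assumed definition (hypothetical, for illustration only):
    identical positions / alignment columns, where columns that are
    gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep every column except those where both sequences have a gap.
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    # A column counts as a match only if both residues are equal and not a gap.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the percent identity is at or above the given threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Toy sequences, not the sequences of any SEQ ID NO.
    ref = "ACGTACGT"
    cand = "ACGAAC-T"
    print(percent_identity(ref, cand))       # 6 of 8 columns match
    print(meets_threshold(ref, cand, 30.0))  # 30% threshold
    print(meets_threshold(ref, cand, 97.0))  # 97% threshold
```

Under this assumed definition, a candidate that clears the 30% threshold can still fall well short of the 97% threshold, which is why the two figures serve different purposes in the claims: the former covers distant homologues of a protein or gene, the latter identifies closely related 16S rRNA genes.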
CN202280010679.5A 2021-01-19 2022-01-18 Compositions for degrading trypsin or TMPRSS2 Pending CN117043341A (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US63/138,798 2021-01-19
US202163229077P 2021-08-04 2021-08-04
US63/229,077 2021-08-04
PCT/JP2022/001494 WO2022158434A1 (en) 2021-01-19 2022-01-18 Composition for decomposition of trypsin or tmprss2

Publications (1)

Publication Number Publication Date
CN117043341A (en) 2023-11-10

Family

ID=88639536

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202280010679.5A Pending CN117043341A (en) 2021-01-19 2022-01-18 Compositions for degrading trypsin or TMPRSS2

Country Status (1)

Country Link
CN (1) CN117043341A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination