CN101501696A - Glycan data mining system - Google Patents

Publication number
CN101501696A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CNA2007800300127A
Other languages
Chinese (zh)
Inventor
Ram Sasisekharan
S. Raguram
Mahadevan Venkataraman
Subramanian Kaundinya
Rahul Raman
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Massachusetts Institute of Technology
Original Assignee
Massachusetts Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by Massachusetts Institute of Technology


Abstract

The present invention provides a system for analyzing glycans and their interaction partners. The inventive system is particularly useful in the identification and analysis of glycoprotein binding interactions.

Description

Glycan data mining system
Priority claim
This application claims priority under 35 USC 119(e) to co-pending U.S. Provisional Patent Application No. 60/837,868, filed August 14, 2006, and to co-pending U.S. Provisional Patent Application No. 60/837,869, filed August 14, 2006. The entire contents of each of these prior applications are incorporated herein by reference.
Government support
This invention was made with United States government support under contract number U54 GM62116 awarded by the National Institute of General Medical Sciences and under contract number GM57073 awarded by the National Institutes of Health. The United States government has certain rights in the invention.
Background
Glycomics, an integrated approach to the structure-function relationships of complex carbohydrates or glycans, is emerging as an important paradigm in post-genomic cell and molecular biology. Over the past several years, the known biological roles of glycans in fundamental biological processes such as cell growth and development, tumor growth and metastasis, anticoagulation, immune recognition/response, cell-cell communication, and microbial pathogenesis have grown significantly. Glycans are key components of the cell surface and of the interface between the cell and its extracellular environment. Glycans therefore interact with multiple proteins such as growth factors, cytokines, immune receptors, and enzymes, modulating their activity and thereby influencing the biological processes listed above.
There is therefore a need to identify and/or characterize glycan binding capabilities.
Summary of the invention
The present invention provides a system for analyzing glycans and their interaction partners. The inventive system is particularly useful for identifying and analyzing glycoprotein binding interactions. As described herein, the inventive system has been applied to several different glycoprotein analyses and successfully identified interaction characteristics in each case. The principles of the inventive system are therefore broadly applicable to glycan interactions.
Brief description of the drawings
Figure 1. Figure 1 illustrates the data mining platform utilized herein. Panel A shows the main components of the data mining platform. Features are derived from data objects extracted from the database. The features are assembled into the data sets used by the classification methods to derive patterns or rules. Panel B shows the software modules that enable a user to apply the data mining methods to glycan array data.
Figure 2. Figure 2 provides a schematic description of the features. Panel A shows a representative high-mannose motif used to illustrate the definitions of pairs, triplets, and quadruplets. Panel B shows a representative O-linked glycan [core 2] motif used to illustrate the different kinds of triplets. Monosaccharides are represented using the following symbol nomenclature: ● Man, ○ Gal, ■ GlcNAc, ▲ Fuc, ◆ Neu5Ac, ◇ Neu5Gc, ◆ KDN.
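The pair and triplet features of Figure 2 can be illustrated with a small sketch. It assumes a hypothetical encoding in which a glycan is a tree of monosaccharides with parent pointers, and it assumes that the different kinds of triplets are linear (three residues in a row) and branched (one residue carrying two others); the text does not give the exact definitions, and linkage types are omitted for brevity.

```python
from itertools import combinations

# Hypothetical toy glycan: node id -> (monosaccharide, parent id or None).
GLYCAN = {
    0: ("GlcNAc", None),
    1: ("Man", 0),
    2: ("Man", 1),
    3: ("Man", 1),
}

def children(glycan):
    """Map each node id to the list of node ids attached to it."""
    kids = {node: [] for node in glycan}
    for node, (_, parent) in glycan.items():
        if parent is not None:
            kids[parent].append(node)
    return kids

def pairs(glycan):
    """Every directly linked (parent, child) pair of node ids."""
    return [(p, n) for n, (_, p) in sorted(glycan.items()) if p is not None]

def triplets(glycan):
    """Return (linear, branched) triplets of node ids (assumed definitions)."""
    kids = children(glycan)
    linear = [(p, n, c) for p, n in pairs(glycan) for c in kids[n]]
    branched = [(n, a, b) for n in sorted(glycan)
                for a, b in combinations(sorted(kids[n]), 2)]
    return linear, branched

def names(glycan, ids):
    """Translate a tuple of node ids into monosaccharide names."""
    return tuple(glycan[i][0] for i in ids)
```

Calling `triplets(GLYCAN)` on the toy tree yields two linear triplets through the branching mannose and one branched triplet rooted at it; `names` converts any of them into a monosaccharide-level feature.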
Figure 3. Figure 3 depicts the classification of high-affinity binders. The figure shows the signal-to-noise ratio [y-axis] of galectin-3 screened against the glycans on the glycan array [numbered sequentially on the x-axis]. The red dotted line indicates the arbitrarily defined threshold used to classify glycans as high-affinity binders. These thresholds were specified separately for each GBP used in the analysis.
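The thresholding step of Figure 3 amounts to a simple split of the array data. The sketch below is illustrative only: the function name, the glycan identifiers, the signal-to-noise values, and the threshold are all made up, and treating a value equal to the threshold as a binder is an assumption.

```python
def classify_binders(signal_to_noise, threshold):
    """Split glycan ids into high-affinity binders and all others.

    signal_to_noise: dict mapping glycan id -> signal-to-noise ratio.
    threshold: cut-off chosen separately for each glycan binding protein.
    """
    high = [g for g, sn in signal_to_noise.items() if sn >= threshold]
    low = [g for g, sn in signal_to_noise.items() if sn < threshold]
    return high, low

# Made-up readings for four array positions.
readings = {1: 2.5, 2: 41.0, 3: 0.8, 4: 17.3}
high, low = classify_binders(readings, threshold=10.0)
```

The two resulting lists serve as the positive and negative classes for a downstream classification method.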
Figure 4. Figure 4 depicts the Lewis x motif-containing glycan structure Galβ4(Fucα3)GlcNAcβ3Gal bound to the CRD of DC-SIGN (shown as a ribbon trace). The monosaccharides and linkages are labeled on the glycan structure. The 3-OH of the Gal proximal to the glycan binding site is circled in red. Any substitution at this hydroxyl would therefore adversely affect binding, which is consistent with the classification rule obtained from data mining.
Figure 5. Alignment of exemplary sequences of wild-type HA. The sequences were obtained from the NCBI influenza virus sequence database (http://www.ncbi.nlm.nih.gov/genomes/FLU/FLU.html).
Figure 6. Framework for understanding glycan receptor specificity. α2-3- and/or α2-6-linked glycans can adopt different topologies. According to the present invention, the ability of an HA polypeptide to bind certain of these topologies confers on it the ability to mediate infection of different hosts (for example, humans). As illustrated in this figure, the present invention defines two particularly relevant topologies: a "cone" topology and an "umbrella" topology. The cone topology can be adopted by α2-3- and/or α2-6-linked glycans and is typical of short oligosaccharides or of branched oligosaccharides attached to a core (although some long oligosaccharides can also adopt this topology). The umbrella topology can be adopted only by α2-6-linked glycans (probably because of the increased conformational plurality afforded by the extra C5-C6 bond present in the α2-6 linkage). It is adopted predominantly by long oligosaccharides or by branched glycans with long oligosaccharide branches, particularly those containing the motif Neu5Acα2-6Galβ1-3/4GlcNAc-. As described herein, the ability of an HA polypeptide to bind the umbrella glycan topology confers binding to human receptors and/or the ability to mediate infection of humans.
Figure 7. Comparison of the interactions of HA residues with the cone and umbrella glycan topologies. Analysis of HA-glycan co-crystals shows that the position of Neu5Ac relative to the HA binding site is almost invariant. Contacts with Neu5Ac involve highly conserved residues such as F98, S/T136, W153, H183 and L/I194. Contacts with the other sugars involve different residues, depending on whether the sugar linkage is α2-3 or α2-6 and whether the glycan topology is cone or umbrella. For instance, in the cone topology, the primary contacts are with the Neu5Ac and Gal sugars. E190 and Q226 play particularly important roles in this binding. The figure also illustrates other positions (for example, 137, 145, 186, 187, 193, 222) that can participate in binding to cone structures. In some cases, different residues can make different contacts with different glycan structures. The type of amino acid at these positions can influence the ability of an HA polypeptide to bind receptors with different modifications and/or branching patterns in their glycan structures. In the umbrella topology, contacts are made with sugars beyond Neu5Ac and Gal. The figure illustrates residues (for example, 137, 145, 156, 159, 186, 187, 189, 190, 192, 193, 196, 222, 225, 226) that can participate in binding to umbrella structures. In some cases, different residues can make different contacts with different glycan structures. The type of amino acid at these positions can influence the ability of an HA polypeptide to bind receptors with different modifications and/or branching patterns in their glycan structures. In some embodiments, a D residue at position 190 and/or a D residue at position 225 contributes to binding to the umbrella topology.
Figure 8. Exemplary cone topologies. This figure illustrates certain exemplary (but not exhaustive) glycan structures that adopt the cone topology.
Figure 9. Exemplary umbrella topologies. This figure illustrates certain exemplary (but not exhaustive) glycan structures that adopt the umbrella topology.
Figure 10. Sequence alignment of the HA glycan binding domain. Gray: conserved amino acids involved in binding sialic acid. Red: particular amino acids involved in binding the Neu5Acα2-3/6Gal motif. Yellow: amino acids that influence the positioning of Q226 (137, 138) and E190 (186, 228). Green: amino acids involved in binding other monosaccharides (or modifications) attached to the Neu5Acα2-3/6Gal motif. The sequences of ASI30, APR34, ADU63, ADS97 and Viet04 were obtained from their respective crystal structures. The other sequences were obtained from SwissProt (http://us.expasy.org). Abbreviations: ADA76, A/duck/Alberta/35/76 (H1N1); ASI30, A/Swine/Iowa/30 (H1N1); APR34, A/Puerto Rico/8/34 (H1N1); ASC18, A/South Carolina/1/18 (H1N1); AT91, A/Texas/36/91 (H1N1); ANY18, A/New York/1/18 (H1N1); ADU63, A/duck/Ukraine/1/63 (H3N8); AAI68, A/Aichi/2/68 (H3N2); AM99, A/Moscow/10/99 (H3N2); ADS97, A/duck/Singapore/3/97 (H5N3); Viet04, A/Vietnam/1203/2004 (H5N1).
Figure 11. Sequence alignment illustrating conserved subsequences characteristic of H1 HA.
Figure 12. Sequence alignment illustrating conserved subsequences characteristic of H3 HA.
Figure 13. Sequence alignment illustrating conserved subsequences characteristic of H5 HA.
Figure 14. Conformational maps and solvent accessibility of the Neu5Acα2-3Gal and Neu5Acα2-6Gal motifs. Panel A shows the conformational map of the Neu5Acα2-3Gal linkage. Encircled region 2 is the anti conformation observed in the APR34_H1_23, ADU63_H3_23 and ADS97_H5_23 co-crystal structures. Encircled region 1 is the conformation observed in the AAI68_H3_23 co-crystal structure. Panel B shows the conformational map of Neu5Acα2-6Gal, in which the cis conformation (encircled region 3) is observed in all of the HA-α2-6 sialylated glycan co-crystal structures. Panel C shows the difference in solvent-accessible surface area (SASA) of Neu5Ac between the α2-3 and α2-6 sialylated oligosaccharides in the respective HA-glycan co-crystal structures. The red and cyan bars indicate that Neu5Ac in the α2-6 (positive value) or α2-3 (negative value) sialylated glycan, respectively, makes more contact with the glycan binding site. Panel D shows the difference between the SASA of NeuAc in α2-3 sialylated glycans bound to swine and human H1 (H1α2-3) and to avian and human H3 (H3α2-3), and the SASA of NeuAc in α2-6 sialylated glycans bound to swine and human H1 (H1α2-6). The negative cyan bar for H3α2-3 indicates that human H3 HA makes less contact with Neu5Acα2-3Gal than avian H3 does. Torsion angles: Φ: C2-C1-O-C3 (for the Neu5Acα2-3/6 linkage); ψ: C1-O-C3-H3 (for Neu5Acα2-3Gal) or C1-O-C6-C5 (for Neu5Acα2-6Gal); ω: O-C6-C5-H5 (for the Neu5Acα2-6Gal linkage). The Φ, ψ maps were obtained from GlycoMaps DB (http://www.glycosciences.de/modeling/glycomapsdb/), which was developed by Dr. Martin Frank and Dr. Claus-Wilhelm von der Lieth (German Cancer Research Institute, Heidelberg, Germany). The color scheme runs from bright red for high energy to bright green for low energy.
Figure 15. Residues of H1, H3 and H5 HA involved in binding α2-3/6 sialylated glycans. Panels A-D show the difference (Δ on the abscissa) in solvent-accessible surface area (SASA) of the residues that interact with α2-3 and α2-6 sialylated glycans in the ASI30_H1, APR34_H1, ADU63_H3 and ADS97_H5 co-crystal structures, respectively. Green bars correspond to residues that interact directly with the glycan, and light orange bars correspond to residues proximal to Glu/Asp190 and Gln/Leu226. A positive Δ for a green bar indicates that the residue makes more contact with α2-6 sialylated glycans, whereas a negative Δ indicates more contact with α2-3 sialylated glycans. Panel E summarizes in tabular form the residues of H1, H3 and H5 HA involved in binding α2-3/6 sialylated glycans. Certain key residues involved in binding α2-3 sialylated glycans are shown in blue, and certain key residues involved in binding α2-6 sialylated glycans are shown in red.
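The sign convention for Δ in Figure 15 can be sketched as a per-residue subtraction. This is a minimal illustration that assumes each residue has already been assigned a contact value (for example, buried surface area) in each complex; how that value is computed is not specified in the text, and the residue labels and numbers below are invented.

```python
def delta_contact(contact_a26, contact_a23):
    """Per-residue difference: contact with the α2-6 glycan minus contact with the α2-3 glycan."""
    residues = sorted(set(contact_a26) | set(contact_a23))
    return {r: contact_a26.get(r, 0.0) - contact_a23.get(r, 0.0) for r in residues}

def preferred_linkage(delta):
    """Positive delta -> more α2-6 contact; negative -> more α2-3 contact."""
    if delta > 0:
        return "a2-6"
    if delta < 0:
        return "a2-3"
    return "none"

# Invented contact values for two hypothetical residues.
a26 = {"res_A": 30.0, "res_B": 5.0}
a23 = {"res_A": 10.0, "res_B": 25.0}
deltas = delta_contact(a26, a23)
calls = {r: preferred_linkage(d) for r, d in deltas.items()}
```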
Figure 16. Binding of Viet04_H5 HA to a biantennary α2-6 sialylated glycan (cone topology). The stereo view of the surface shows the Viet04_H5 glycan binding site with the Neu5Acα2-6Gal linkage in the extended conformation (obtained from the pertussis toxin co-crystal structure; PDB ID: 1PTO). Lys193 (orange) makes no contact with the glycan in this conformation. Other amino acids potentially involved in binding the glycan in this conformation are Asn186, Lys222 and Ser227. However, some contacts observed in HA bound to α2-6 sialylated oligosaccharides in the cis conformation are absent in the extended conformation. Without wishing to be bound by any particular theory, it is noted that this suggests that binding to HA in the extended conformation may be less optimal than in the cis conformation. Structures of branched N-linked glycans in which the Neu5Acα2-6Galβ1-4GlcNAc branch is attached to Manα1-3Man (PDB ID: 1LGC) and to Manα1-6Man (PDB ID: 1ZAG) were superimposed on the Neu5Acα2-6Gal linkage, in both the cis and the extended conformations, in the Viet04_H5 HA binding site. The superimposition shows that the structure in which the Neu5Acα2-6Galβ1-4GlcNAc branch is attached to Manα1-6Man of the core has unfavorable steric overlaps with the binding site (in both conformations). The structure in which the branch is attached to Manα1-3Man of the core (shown in the figure, with the trimannose core in purple), on the other hand, has steric overlaps with Lys193 in the cis conformation but, although not ideal, can still bind in the extended conformation without any contact with Lys193.
Figure 17. Production of WT H1, H3 and H5 HA. Panel A shows soluble forms of HA protein from H1N1 (A/South Carolina/1/1918), H3N2 (A/Moscow/10/1999) and H5N1 (A/Vietnam/1203/2004) run on a 4-12% SDS-polyacrylamide gel and blotted onto a nitrocellulose membrane. H1N1 HA was detected using goat anti-influenza A antibody and anti-goat IgG-HRP. H3N2 was detected using ferret anti-H3N2 HA antiserum and anti-ferret HRP. H5N1 was detected using anti-avian H5N1 HA antibody and anti-rabbit IgG-HRP. H1N1 HA and H3N2 HA are present as HA0, whereas H5N1 HA is present as HA0 and HA1. Panel B shows full-length H5N1 HA and two variants (Glu190Asp Lys193Ser Gly225Asp Gln226Leu, "DSDL"; and Glu190Asp Lys193Ser Gln223Leu Gly228Ser, "DSLS") run on an SDS-polyacrylamide gel and blotted onto a nitrocellulose membrane. HA was detected using anti-avian H5N1 antibody and anti-rabbit IgG-HRP.
Figure 18. Lectin staining of upper respiratory tissue sections. Co-staining of tracheal tissue with Jacalin (green) and ConA (red) reveals that Jacalin (which specifically binds O-linked glycans) binds preferentially to goblet cells on the apical surface of the trachea, whereas ConA (which specifically binds N-linked glycans) binds preferentially to the ciliated tracheal epithelial cells. Without wishing to be bound by any particular theory, it is noted that this finding suggests that goblet cells predominantly express O-linked glycans, whereas ciliated epithelial cells predominantly express N-linked glycans. Co-staining of trachea with Jacalin and SNA (red; specifically binds α2-6) shows binding of SNA to both goblet cells and ciliated cells. On the other hand, co-staining with Jacalin (green) and MAL (red), which specifically binds α2-3 sialylated glycans, shows weak or no binding of MAL to the pseudostratified ciliated tracheal epithelium but extensive binding to the underlying regions of the tissue. Taken together, the lectin staining data indicate predominant expression and extensive distribution of α2-6 sialylated glycans on the apical side of the tracheal epithelium, as part of the N-linked glycans in ciliated cells and of the O-linked glycans in goblet cells, respectively.
Figure 19. Binding of recombinant wild-type and mutant HA to tissue sections. The figure shows binding of wild-type (WT), DSLS and DSDL HA to tracheal, bronchial and alveolar tissue sections. For WT, the white arrows show binding of HA to the alveolar tissue section (green). For the DSLS mutant, the white arrows on the tracheal and bronchial tissue sections show binding of this mutant HA to the apical side of the tissue (green). Notably, the DSDL mutant does not bind to any of the tissue sections.
Detailed description
Definition
Affinity: As is known in the art, "affinity" is a measure of the tightness with which a particular ligand (for example, an HA polypeptide) binds to its partner (for example, an HA receptor). Affinities can be measured in different ways.
Biologically active: As used herein, the phrase "biologically active" refers to a characteristic of any agent that has activity in a biological system, and particularly in an organism. For instance, an agent that has a biological effect on an organism when administered to that organism is considered to be biologically active. In particular embodiments, where a protein or polypeptide is biologically active, a portion of that protein or polypeptide that shares at least one biological activity of the protein or polypeptide is typically referred to as a "biologically active" portion.
Broad spectrum human-binding (BSHB) H5 HA polypeptide: As used herein, the phrase "broad spectrum human-binding H5 HA" refers to a version of an H5 HA polypeptide that binds to HA receptors found in human epithelial tissues, and particularly to human HA receptors having α2-6 sialylated glycans. In addition, the BSHB H5 HAs of the present invention bind to a plurality of different α2-6 sialylated glycans. In some embodiments, BSHB H5 HAs bind to the different α2-6 sialylated glycans found in a large number of human samples, so that viruses containing them have a significant ability to infect human populations and, in particular, to bind upper respiratory tract receptors in those populations. In some embodiments, BSHB H5 HAs bind to umbrella glycans as described herein (for example, long α2-6 sialylated glycans).
Characteristic portion: As used herein, a "characteristic portion" of a protein or polypeptide is a portion that contains a continuous stretch of amino acids, or a collection of continuous stretches of amino acids, that together are characteristic of the protein or polypeptide. Each such continuous stretch will generally contain at least two amino acids. Furthermore, those of ordinary skill in the art will appreciate that typically at least 5, 10, 15, 20 or more amino acids are required to be characteristic of a protein. In general, a characteristic portion is one that, in addition to the sequence identity specified above, shares at least one functional characteristic with the relevant intact protein.
Characteristic sequence: A "characteristic sequence" is a sequence that is found in all members of a family of polypeptides or nucleic acids, and can therefore be used by those of ordinary skill in the art to define members of the family.
Cone topology: The phrase "cone topology" is used herein to refer to a three-dimensional arrangement adopted by certain glycans, and in particular by glycans on HA receptors. As illustrated in Figure 6, the cone topology can be adopted by α2-3 sialylated glycans or by α2-6 sialylated glycans and is typical of short oligosaccharide chains, although some long oligosaccharides can also adopt this conformation. The cone topology is characterized by the glycosidic torsion angles of the Neu5Acα2-3Gal linkage, which sample three regions of minimum-energy conformations given by Φ (C1-C2-O-C3/C6) values of about -60, 60 or 180 and ψ (C2-O-C3/C6-H3/C5) values from -60 to 60 (Figure 14). Figure 8 presents certain representative (but not exhaustive) examples of glycans that adopt the cone topology.
Corresponding to: As used herein, the term "corresponding to" is often used to designate the position/identity of an amino acid residue in an HA polypeptide. Those of ordinary skill in the art will appreciate that, for purposes of simplicity, a canonical numbering system (based on wild-type H3 HA) is utilized herein (as illustrated, for example, in Figures 5 and 10-13). Thus, for example, an amino acid "corresponding to" the residue at position 190 need not actually be the 190th amino acid in a particular amino acid chain, but rather corresponds to the residue found at position 190 in wild-type H3 HA. Those of ordinary skill in the art readily appreciate how to identify corresponding amino acids.
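One common way to find a "corresponding" residue is to read it off a pairwise alignment against the reference sequence. The sketch below assumes the two sequences have already been aligned (gaps written as "-") and that reference numbering starts at `ref_start`; the function name, parameters, and toy sequences are hypothetical and are not taken from the text.

```python
def corresponding_residue(ref_aligned, query_aligned, ref_position, ref_start=1):
    """Return (1-based index in query, residue) aligned to a reference position.

    Returns None if the reference position is absent from the alignment or
    is aligned to a gap in the query.
    """
    if len(ref_aligned) != len(query_aligned):
        raise ValueError("aligned sequences must have equal length")
    ref_num = ref_start - 1   # numbering of the last reference residue seen
    query_num = 0             # count of query residues seen so far
    for ref_char, query_char in zip(ref_aligned, query_aligned):
        if ref_char != "-":
            ref_num += 1
        if query_char != "-":
            query_num += 1
        if ref_char != "-" and ref_num == ref_position:
            return None if query_char == "-" else (query_num, query_char)
    return None

# Toy alignment: the query has one insertion and one deletion relative to the reference.
REF = "AC-DEF"
QRY = "ACGD-F"
```

In the toy alignment, reference position 3 corresponds to the fourth residue of the query, which shows why the corresponding amino acid need not sit at the same index in its own chain.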
Degree of separation removed: As used herein, amino acids that are a "degree of separation removed" are HA amino acids that have indirect effects on glycan binding. For instance, one-degree-of-separation-removed amino acids may: (1) interact with the direct-binding amino acids; and/or (2) otherwise affect the ability of direct-binding amino acids to interact with a glycan that is associated with host cell HA receptors. Such one-degree-of-separation-removed amino acids may or may not directly bind to glycan themselves. Two-degree-of-separation-removed amino acids either (1) interact with one-degree-of-separation-removed amino acids; and/or (2) otherwise affect the ability of the one-degree-of-separation-removed amino acids to interact with direct-binding amino acids, and so on.
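Degrees of separation can be pictured as distances in an interaction graph. The sketch below is only an illustration: it assumes a residue interaction map is available as an adjacency dictionary and computes each residue's degree by breadth-first search from the direct-binding residues. The first-degree neighbours reuse the positions that Figure 10 lists as influencing Q226 and E190; the second-degree residue "X" is purely hypothetical.

```python
from collections import deque

def degrees_of_separation(interactions, direct_binders):
    """Breadth-first search outward from the direct-binding residues.

    interactions: dict mapping a residue to the residues it interacts with.
    Returns a dict residue -> degree (0 for direct binders).
    """
    degree = {res: 0 for res in direct_binders}
    queue = deque(direct_binders)
    while queue:
        current = queue.popleft()
        for neighbor in interactions.get(current, ()):
            if neighbor not in degree:
                degree[neighbor] = degree[current] + 1
                queue.append(neighbor)
    return degree

INTERACTIONS = {
    "Q226": ["137", "138"],
    "E190": ["186", "228"],
    "186": ["X"],  # hypothetical residue, two degrees removed
}
DEGREES = degrees_of_separation(INTERACTIONS, ["Q226", "E190"])
```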
Direct-binding amino acids: As used herein, the phrase "direct-binding amino acids" refers to HA polypeptide amino acids that interact directly with one or more glycans associated with host cell HA receptors.
Engineered: As used herein, the term "engineered" describes a polypeptide whose amino acid sequence has been selected by man. For instance, an engineered HA polypeptide has an amino acid sequence that differs from the amino acid sequences of HA polypeptides found in natural influenza isolates. In some embodiments, an engineered HA polypeptide has an amino acid sequence that differs from the amino acid sequences of the HA polypeptides included in the NCBI database.
H1 polypeptide: An "H1 polypeptide," as that term is used herein, is an HA polypeptide whose amino acid sequence includes at least one sequence element that is characteristic of H1 and distinguishes H1 from other HA subtypes. Representative such sequence elements can be determined by alignments such as those illustrated in Figures 5 and 10-11, and include, for example, those described herein with regard to H1-specific embodiments of HA sequence elements.
H3 polypeptide: An "H3 polypeptide," as that term is used herein, is an HA polypeptide whose amino acid sequence includes at least one sequence element that is characteristic of H3 and distinguishes H3 from other HA subtypes. Representative such sequence elements can be determined by alignments such as those illustrated in Figures 5, 10 and 12, and include, for example, those described herein with regard to H3-specific embodiments of HA sequence elements.
H5 polypeptide: An "H5 polypeptide," as that term is used herein, is an HA polypeptide whose amino acid sequence includes at least one sequence element that is characteristic of H5 and distinguishes H5 from other HA subtypes. Representative such sequence elements can be determined by alignments such as those illustrated in Figures 5, 10 and 13, and include, for example, those described herein with regard to H5-specific embodiments of HA sequence elements.
Hemagglutinin (HA) polypeptide: As used herein, the term "hemagglutinin polypeptide" (or "HA polypeptide") refers to a polypeptide whose amino acid sequence includes at least one characteristic sequence of HA. A wide variety of HA sequences from influenza isolates are known in the art; indeed, the National Center for Biotechnology Information (NCBI) maintains a database (www.ncbi.nlm.nih.gov/genomes/FLU/flu.html) that, as of the filing of the present application, included 9796 HA sequences. Those of ordinary skill in the art, referring to this database, can readily identify sequences that are characteristic of HA polypeptides generally and/or of particular HA polypeptides (for example, H1, H2, H3, H4, H5, H6, H7, H8, H9, H10, H11, H12, H13, H14, H15 or H16 polypeptides), or of HAs that mediate infection of particular hosts, for example avian, camel, canine, cat, civet, environment, equine, human, leopard, mink, mouse, seal, stone martin, swine, tiger, whale, and so on. For example, in some embodiments, an HA polypeptide includes one or more characteristic sequence elements found between about residues 97 and 185, 324 and 340, 96 and 100, and/or 130-230 of an HA protein found in a natural isolate of an influenza virus. In some embodiments, an HA polypeptide has an amino acid sequence comprising at least one of HA sequence elements 1 and 2, as defined herein. In some embodiments, an HA polypeptide has an amino acid sequence comprising HA sequence elements 1 and 2, which in some embodiments are separated from one another by about 100-200, or about 125-175, or about 125-160, or about 125-150, or about 129-139, or about 129, 130, 131, 132, 133, 134, 135, 136, 137, 138 or 139 amino acids. In some embodiments, an HA polypeptide has an amino acid sequence that includes residues at positions within the regions 96-100 and/or 130-230 that participate in glycan binding. For example, many HA polypeptides include one or more of the following residues: Tyr98, Ser/Thr136, Trp153, His183 and Leu/Ile194. In some embodiments, an HA polypeptide includes at least 2, 3, 4 or all 5 of these residues.
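Counting how many of the five residues listed above are present can be sketched as a lookup. The example assumes the sequence has already been mapped onto the H3-based numbering, so the input is a dictionary from H3 position to one-letter residue code; that input format and the names below are illustrative choices rather than part of the disclosure.

```python
# Allowed one-letter residues at each H3-numbered position, from the list above:
# Tyr98, Ser/Thr136, Trp153, His183, Leu/Ile194.
CONSERVED = {
    98: {"Y"},
    136: {"S", "T"},
    153: {"W"},
    183: {"H"},
    194: {"L", "I"},
}

def count_conserved(residues_by_h3_position):
    """Number of the five listed positions that carry an allowed residue."""
    return sum(
        1
        for position, allowed in CONSERVED.items()
        if residues_by_h3_position.get(position) in allowed
    )

# Hypothetical example: four positions match, position 183 does not.
example = {98: "Y", 136: "T", 153: "W", 183: "Q", 194: "I"}
matches = count_conserved(example)
```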
Isolated: As used herein, the term "isolated" refers to an agent or entity that has either (i) been separated from at least some of the components with which it was associated when initially produced (whether in nature or in an experimental setting); or (ii) been produced by the hand of man. Isolated agents or entities may be separated from at least about 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or more of the other components with which they were initially associated. In some embodiments, isolated agents are more than 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% pure.
Long oligosaccharide: For purposes of the present disclosure, an oligosaccharide is typically considered to be "long" if it includes at least one linear chain that has at least four saccharide residues.
Non-natural amino acid: The phrase "non-natural amino acid" refers to an entity that has the chemical structure of an amino acid (that is, the structure shown in chemical structure image A200780030012D00101) and can therefore participate in at least two peptide bonds, but whose R group differs from those found in nature. In some embodiments, non-natural amino acids may also have a second R group other than hydrogen, and/or may have one or more other substitutions on the amino or carboxylic acid moieties.
Polypeptide: A "polypeptide," generally speaking, is a chain of at least two amino acids attached to one another by a peptide bond. In some embodiments, a polypeptide may include at least 3-5 amino acids, each of which is attached to others by way of at least one peptide bond. Those of ordinary skill in the art will appreciate that polypeptides sometimes include "non-natural" amino acids or other entities that are nonetheless capable of being integrated into a polypeptide chain.
Pure: As used herein, an agent or entity is "pure" if it is substantially free of other components. For instance, a preparation that contains more than about 90% of a particular agent or entity is typically considered to be a pure preparation. In some embodiments, an agent or entity is at least 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% pure.
Short oligosaccharide: For purposes of the present disclosure, an oligosaccharide is typically considered to be "short" if it has fewer than 4, or certainly fewer than 3, residues in any linear chain.
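The "long" and "short" definitions can be sketched as a check on the longest linear chain. The example assumes a hypothetical encoding in which each residue points to its parent, and it assumes that a linear chain is counted along the path from a terminal residue back to the root; the disclosure does not say how chains are delimited in branched glycans, so that counting rule is an assumption.

```python
def longest_chain(parent_of):
    """Number of residues on the longest path from any residue back to the root."""
    def depth(node):
        count = 1
        while parent_of[node] is not None:
            node = parent_of[node]
            count += 1
        return count
    return max(depth(node) for node in parent_of)

def chain_class(parent_of, long_min=4):
    """Classify as 'long' if some linear chain has at least long_min residues."""
    return "long" if longest_chain(parent_of) >= long_min else "short"

# Hypothetical encodings: residue id -> parent id (None for the root).
TETRASACCHARIDE = {0: None, 1: 0, 2: 1, 3: 2}      # one linear chain of 4
BRANCHED_TRIMER = {0: None, 1: 0, 2: 0}            # two chains of 2
```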
Specificity: As is known in the art, "specificity" is a measure of the ability of a particular ligand (for example, an HA polypeptide) to distinguish its binding partner (for example, a human HA receptor, and particularly a human upper respiratory tract HA receptor) from other potential binding partners (for example, an avian HA receptor).
Therapeutic agent: As used herein, the phrase "therapeutic agent" refers to any agent that elicits a desired biological or pharmacological effect.
Treatment: As used herein, the term "treatment" refers to any method used to alleviate, delay the onset of, reduce the severity or incidence of, or yield prophylaxis of one or more symptoms or aspects of a disease, disorder or condition. For purposes of the present invention, treatment can be administered before, during and/or after the onset of symptoms.
Umbrella topology: The phrase "umbrella topology" is used herein to refer to a three-dimensional arrangement adopted by certain glycans, and in particular by glycans on HA receptors. The present invention encompasses the recognition that binding to umbrella topology glycans is characteristic of HA proteins that mediate infection of human hosts. As illustrated in Figure 6, the umbrella topology is typically adopted only by α2-6 sialylated glycans, and is typical of long (for example, greater than tetrasaccharide) oligosaccharides. An example of the umbrella topology is given by a Φ angle of about -60 for the Neu5Acα2-6Gal linkage (see, for example, Figure 14). Figure 9 presents certain representative (but not exhaustive) examples of glycans that adopt the umbrella topology.
Vaccination: as used herein, the term "vaccination" refers to the administration of a composition intended to generate an immune response, for example to a disease-causing agent. For purposes of the present invention, vaccination can be administered before, during, and/or after exposure to a disease-causing agent, and in certain embodiments before, during, and/or shortly after exposure to the agent. In certain embodiments, vaccination includes multiple administrations, appropriately spaced in time, of a vaccinating composition.
Variant: as used herein, the term "variant" is a relative term that describes the relationship between a particular HA polypeptide of interest and a "parent" HA polypeptide to which its sequence is being compared. An HA polypeptide of interest is considered to be a "variant" of a parent HA polypeptide if it has an amino acid sequence that is identical to that of the parent but for a small number of sequence alterations at particular positions. Typically, fewer than 20%, 15%, 10%, 9%, 8%, 7%, 6%, 5%, 4%, 3%, or 2% of the residues in the variant are substituted as compared with the parent. In certain embodiments, a variant has 10, 9, 8, 7, 6, 5, 4, 3, 2, or 1 substituted residues as compared with the parent. A variant typically has a very small number (e.g., fewer than 5, 4, 3, 2, or 1) of substituted functional residues (i.e., residues that participate in a particular biological activity). Furthermore, a variant typically has no more than 5, 4, 3, 2, or 1 additions or deletions as compared with the parent, and often has none. Moreover, any additions or deletions are typically fewer than about 25, 20, 19, 18, 17, 16, 15, 14, 13, 10, 9, 8, 7, or 6 residues, and commonly fewer than about 5, 4, 3, or 2 residues. In certain embodiments, the parent HA polypeptide is one found in a natural isolate of an influenza virus (e.g., a wild-type HA).
Vector: as used herein, "vector" refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked. In certain embodiments, vectors are capable of extrachromosomal replication and/or expression of the nucleic acids to which they are linked in a host cell such as a eukaryotic or prokaryotic cell. Vectors capable of directing the expression of operatively linked genes are referred to herein as "expression vectors."
Wild type: as is understood in the art, the phrase "wild type" generally refers to the normal form of a protein or nucleic acid as found in nature. For example, wild-type HA polypeptides are found in natural isolates of influenza virus. A variety of different wild-type HA sequences can be found in the NCBI influenza virus sequence database, http://www.ncbi.nlm.nih.gov/genomes/FLU/FLU.html.
Detailed description of certain particular embodiments of the invention
Definition and characterization of glycan-GBP interactions
An important family of proteins, generally termed glycan binding proteins (GBPs), binds N-linked and O-linked glycans on various glycoproteins and mediates cell-cell adhesion, signaling, and trafficking events in the immune response. The main classes of GBPs include C-type lectins, galectins, and sialic acid-binding immunoglobulin-like lectins (siglecs). GBPs are typically expressed as soluble or membrane-bound proteins, in monomeric or multimeric form, with multiple glycan binding sites. In addition, GBPs can be dispersed on the cell surface or localized in microdomains.
The glycan binding site of a GBP is also known as the carbohydrate recognition domain (CRD). The CRD of a GBP typically accommodates a mono- to tetrasaccharide glycan ligand motif. The interaction between a single CRD and a glycan motif typically has low affinity, in the μM range. However, most physiological glycan-GBP interactions are multivalent, involving the binding of an ensemble of glycan motifs to multimeric CRDs formed by association of GBPs. Thus, unlike protein-protein interactions, which activate or inhibit protein function (digital regulation), glycan-GBP interactions fine-tune protein function (analog modulation) through affinity, graded avidity, and multivalency.
Decoding the structure-function relationships of glycan-protein interactions in the context of the biochemical pathways that give rise to biological function poses unique challenges. One aspect of these challenges arises from the heterogeneity and chemical diversity of glycans, because the non-template-driven biosynthesis of glycans involves the coordinated expression of multiple glycosyltransferases (some of which have additional tissue-specific isoforms). Furthermore, owing to the biosynthesis and cellular localization of glycans (e.g., multiple glycosylation sites on a protein), glycans isolated from cells and tissues are usually obtained as heterogeneous mixtures of different chemical structures. The non-template-driven nature of glycan biosynthesis also makes it problematic to amplify specific glycan structures from biological sources.
Much progress has been made in addressing these challenges. Significant developments in chemical synthesis strategies have led to the synthesis of hundreds of glycan structures that capture the diversity of glycans present on cell surfaces. Using these strategies, different presentations of glycan motifs (i.e., clusters, dendrimers, polymers, etc.) have been constructed to match the different types of multivalent association of glycan binding sites on proteins. These multivalent glycoconjugates have been used mainly in competition assays, to evaluate the relative binding affinities of different GBPs and to design inhibitors of physiological glycan-GBP interactions. Despite these advances, little is understood about the specificity of different GBPs for individual physiological glycan motifs, or about the selectivity of the biological functions modulated by these interactions.
In order to rapidly expand the current knowledge of specific glycan-GBP interactions, the Consortium for Functional Glycomics (CFG; www.functionalglycomics.org), an international collaborative research initiative, has developed glycan arrays comprising several glycan structures that enable high-throughput screening of GBPs for novel glycan ligand specificities. These glycan arrays are continually being expanded to increase the diversity of glycan motifs, so as to best model the physiological diversity of glycans. Most glycans on the CFG arrays are obtained by chemical and chemoenzymatic synthesis.
The CFG glycan arrays also comprise monovalent and polyvalent glycan motifs (the latter attached to a polyacrylamide backbone), and are emerging as a widely used resource for glycobiologists seeking novel glycan ligands for their GBPs of interest. In addition to the glycan array data, the CFG is also developing state-of-the-art resources for generating diverse data sets, ranging from gene expression of glycan biosynthetic enzymes and GBPs to whole-organism glycome and phenome analysis.
Public dissemination of the CFG data sets via user-friendly interfaces has begun to motivate the development of data mining tools that discover interesting patterns or make meaningful predictions by analyzing these complex data sets. Data mining tools are increasingly prevalent in genomics and proteomics. High-throughput data representing the various components (genes and proteins) of biochemical pathways and their interactions in complex networks have been analyzed to obtain statistically significant correlations and predictions. In the case of glycomics, because of the analog nature of glycan-GBP interactions, it is necessary to go beyond the interaction of a single glycan with a single GBP in order to understand the common features of the ensemble of structures that govern binding to a specific GBP.
As a first step toward building data mining tools for analyzing high-throughput glycomics data, we have taken a novel approach in this study and analyzed the CFG glycan array data using a rule-induction-based data mining method. Taking advantage of the flexible software architecture and relational database of the CFG, we have used our method to identify patterns that govern the ability of an ensemble of glycans to bind a specific GBP. Using specific examples from three different families of GBPs, namely (1) DC-SIGN and DC-SIGNR, (2) galectins, and (3) hemagglutinin, we identify specific patterns that govern the interaction of glycans on the array with these proteins. We were able to validate the identified patterns by using crystal structures and by predicting the extent of binding between GBPs and glycans not present on the glycan array. These patterns make it possible, for the first time, to understand the interaction between an ensemble of glycan structures (containing a set of common features) and a given GBP, thereby permitting the analysis and determination of structure-function relationships of glycan-GBP interactions.
Accordingly, the present invention provides a system for understanding the structure-function relationships of glycan-GBP interactions. In particular, the invention provides a system for understanding how interactions between an ensemble of glycan structures and the multivalent CRDs of a GBP modulate fundamental biological processes. The invention identifies specific glycans, or features thereof, that determine interaction with a given binding partner. The invention also defines the constraints imposed by such features, for example on the basis of analytical information (e.g., X-ray crystallography, NMR, etc.). Such constraints may be used alone or, optionally, in combination with functional or other information. Suitable functional information can be obtained, for example, from glycan binding studies.
The invention provides computational methods for analyzing data sets obtained from glycan arrays (such as those developed by the CFG), which are increasingly used to identify novel candidate glycan ligands for different GBPs. As these glycan arrays continue to expand, the value of such computational methods for analyzing the data sets obtained from the arrays, and for understanding the basis of specificity of glycan-GBP interactions, increases accordingly.
For example, by using a rule-based data mining method to analyze entire glycan array data sets (including high-, medium-, and low-affinity binders and non-binders), the invention provides a novel way of identifying glycan patterns that have positive and negative effects on binding to a GBP. One advantage of this rule-based method is that the final patterns are presented as a set of easily understood rules, which can readily be used to identify other potential glycans that satisfy those rules.
As described herein, the principles of the invention were applied to three different families of GBPs as a proof of principle of their utility. In the first example (i.e., the DC-SIGN and DC-SIGNR system), the rules gave three main features for DC-SIGN (i.e., motifs containing high mannose, Lewis x [Galb4(Fuca3)GlcNAc], and Fuca4GlcNAc) and only the high-mannose feature for DC-SIGNR. In addition to capturing the common features that govern high-affinity binding, the rules also captured features that are unfavorable for binding, such as the absence of any 3-O substitution on Gal in motifs containing Lewis x. These findings on unfavorable features agree with crystal structure analysis of DC-SIGNR, which highlights the value of the method.
In the case of the galectins, the rules were more complex. In addition to identifying the main feature (Galb4GlcNAc) required for high-affinity ligand binding to galectin-1 and galectin-3, we also determined the role that substitution of this unit, in the context of a certain chain length, plays in governing the interaction with glycan ligands. As in the DC-SIGN example, our findings agree with crystal structure analysis of the galectins. According to the features, the main difference between glycan binding by galectin-1 and by galectin-3 is that galectin-3 prefers linear repeating units of Galb4GlcNAc rather than the presence of these units on different branches of N-linked glycans. Because galectin-1 typically exists as a homodimer with non-covalently associated CRDs, it is likely that the presence of Galb4GlcNAc on different branches enhances high-affinity multivalent binding. Galectin-3, on the other hand, is a monomer with an N-terminal linker region, and it likely prefers linear repeats of lactosamine units over these units presented on branches.
The overall accuracy of the rule-based induction method was good: the rules correctly identified between 80% (in the case of DC-SIGN) and 100% (galectin-3 and DC-SIGNR) of the high binders, with no false positives in any case. Although the glycan arrays contain a diverse set of glycans, they still do not systematically capture the overall diversity of glycans. Consequently, the screening data contain isolated data points, i.e., high-affinity glycan structures that do not fall into any particular group defined by a set of common features. These isolated data points give rise to false negatives in the predictions. From Tables 2 and 3 it can be seen that each group of high-affinity glycans described by a rule contains a common main glycan motif. In addition, these main motifs are specified in conjunction with other constraints, such as the absence of other motifs or chain-length requirements.
As part of the overall data mining approach, further features based on these main patterns can be defined, and the effect of these features on glycan binding can be studied further. For example, the position of Galb4GlcNAc, in terms of its distance from the reducing or non-reducing end and its presence as part of a linear or branched chain, can be defined as an additional feature in order to assess its effect on binding. In addition, other glycan features that combine all modifications of each monosaccharide (such as GalNAc, Gal[3-O-SO3], Gal[6-O-SO3]) into a single feature can be defined, in order to assess the importance of each of these modifications for binding.
In summary, using the CFG glycan array data as a model system, we have outlined a method for identifying rules or patterns in complex data sets, which will facilitate their meaningful interpretation. Many large-scale glycomics initiatives are directing their resources toward obtaining diverse data sets, ranging from gene expression of glycan biosynthetic enzymes and GBPs to glycan profiles identified from specific cell types and from tissues isolated from different sources. As these data sets expand, the rule-based induction method outlined herein can be used to obtain patterns that govern gene expression in combination with glycan-GBP interactions and biological function.
Applications
The present invention allows detailed characterization of glycan-GBP binding interactions. Thus, the invention provides for the determination of sets of glycans that do (or do not) interact with a given GBP. The invention thereby allows the preparation of GBP-specific glycan arrays, i.e., arrays containing a set of glycans sufficient to establish or determine the presence or state of a particular GBP.
For example, once the glycan binding characteristics of a particular GBP have been determined as provided herein, an array containing bound glycans, unbound glycans, and/or combinations thereof can be assembled and used, for example, to detect the particular GBP in a sample and/or to characterize GBP derivatives.
As one particular example, one GBP whose binding is analyzed below is the hemagglutinin (HA) H5 protein. In general, HA interacts with the cell surface by binding to a glycoprotein receptor. Binding of HA to HA receptors is mediated predominantly by N-linked glycans on the HA receptor. Specifically, HA on the surface of influenza virus particles recognizes sialylated glycans associated with HA receptors on the surface of the host cell. After recognition and binding, the host cell engulfs the virus, and the virus is able to replicate and produce many virus particles that spread to neighboring cells.
HA receptors are modified by α2-3 or α2-6 sialylated glycans near the receptor's HA binding site, and the type of linkage of the receptor-bound glycan can affect the conformation of the receptor's HA binding site, thereby affecting the receptor's specificity for different HA subtypes. In addition, the inventors have determined that the topology (umbrella or cone) of the attached glycan can affect the receptor's specificity for different HAs.
For example, the glycan binding pocket of avian HA is narrow. According to the present invention, this pocket binds to the trans conformation of α2-3 sialylated glycans and/or to cone-topology glycans, whether α2-3 or α2-6 linked.
HA receptors in avian tissues, and in human deep lung and gastrointestinal (GI) tract tissues, are characterized by α2-3 sialylated glycan linkages, and furthermore (according to the present invention) are characterized by glycans, including α2-3 sialylated and/or α2-6 sialylated glycans, that predominantly adopt cone topology.
By contrast, human HA receptors in the bronchus and trachea of the upper respiratory tract are modified by α2-6 sialylated glycans. Unlike the α2-3 motif, the α2-6 motif has an additional degree of conformational freedom due to the C6-C5 bond (Russell et al., Glycoconj J 23:85, 2006). HAs that bind such α2-6 sialylated glycans have a more open binding pocket, which accommodates the diversity of structures arising from this conformational freedom. Moreover, according to the present invention, HAs may need to bind glycans of umbrella topology (e.g., α2-6 sialylated glycans), and in particular may need to bind such umbrella-topology glycans with strong affinity and/or specificity, in order to effectively mediate infection of human upper respiratory tract tissues.
As a result of these spatially restricted glycosylation profiles, humans are not usually infected by viruses containing many wild-type avian HAs (e.g., avian H5). Specifically, because the portions of the human respiratory tract most likely to encounter virus (i.e., the trachea and bronchi) lack receptors with cone glycans (e.g., α2-3 sialylated glycans and/or short glycans), and wild-type avian HAs typically bind primarily or exclusively to receptors associated with cone glycans (e.g., α2-3 sialylated glycans and/or short glycans), humans are rarely infected by avian viruses. Only when in sufficiently close contact with virus that it can access the deep lung and/or gastrointestinal tract receptors having umbrella glycans (e.g., long α2-6 sialylated glycans) do humans become infected.
As described herein, the present invention allows identification of a set of glycans that can be used to detect H5 HA proteins and/or to detect protein variants with altered binding specificity that may arise. Specifically, such inventive arrays can be used to detect any H5 variant, or indeed any HA protein or variant thereof, by its ability to bind upper respiratory tract human receptors and/or its ability to bind umbrella-topology glycans (optionally with high affinity and/or specificity, preferably with high affinity).
As described herein, such arrays are useful for identifying and/or characterizing different HA proteins and their glycan binding characteristics. In certain embodiments, H5 HA variant proteins of the invention are tested on such arrays to assess their ability to bind umbrella-topology glycans (e.g., α2-6 glycans, and particularly long α2-6 glycans), and in particular their ability to bind a plurality of such glycans.
Indeed, the invention provides arrays of umbrella glycans (e.g., α2-6 glycans, and particularly long α2-6 glycans) and, optionally, cone-topology glycans (e.g., α2-3 sialylated glycans) that can be used to characterize HA binding capabilities and/or as a diagnostic, for example to detect human-binding HAs. As will be appreciated by those of ordinary skill in the art, such arrays are useful for characterizing or detecting not only H5 HAs but indeed any HA whose ability to bind α2-6 glycans is to be assessed, including, for example, H7 and/or H9.
Examples
Example 1: Data mining method
Description of the glycan arrays and source of the glycan array data
The CFG has developed two types of glycan arrays: (1) well-based microarrays and (2) solid-phase printed arrays. The printed arrays were developed more recently, so most of the initial ligand screening was performed using the well-based microarrays. The first version of the well-based array developed by the CFG comprised about 60 different glycans, each represented in triplicate. Additional glycans were incorporated in each subsequent version of the array, and the most recent version comprises 195 glycans, each represented in quadruplicate (see http://www.functionalglycomics.org/static/consortium/resources/resourcecoreh5.shtml). The arrays comprise mainly synthetic glycans that capture the physiological diversity of N-linked and O-linked glycans. The arrays also comprise polyvalent glycan ligands attached to a polyacrylamide backbone. In addition to the synthetic glycans, mixtures of N-linked glycans derived from different mammalian glycoproteins are also presented on the arrays.
The data sets selected for analysis in this study were obtained from the CFG website (http://www.functionalglycomics.org/glycomics/publicdata/primaryscreen.jsp). To date, 40 mammalian GBPs have been screened against different versions of the glycan array. The screening data, which include the signal intensity of a given GBP in a given well as well as the average signal and signal-to-noise ratio of the GBP for each glycan ligand on the array, were used in their raw form. It should be noted that, because the glycan array evolved into its current version over time, GBPs screened using earlier versions of the array were generally not rescreened using the latest version. The absence of these data points bears on the identification of additional features that distinguish binding of a glycan ligand to one GBP from its binding to another (as discussed herein). Data sets corresponding to screens of DC-SIGN, DC-SIGNR, human galectin-1 and galectin-3 (and their individual carbohydrate recognition domains), and hemagglutinin H5 were obtained. These data sets were analyzed using the data mining platform described below.
Data mining platform
The main steps involved in the data mining method are illustrated in Figure 1. These steps involve operations on three elements: data objects, features, and classifiers. "Data objects" are the raw data stored in a database. In the case of glycan array data, the chemical description of a glycan structure in terms of its monosaccharides and linkages, together with its binding signals to the different GBPs screened, constitutes the data object. The key properties of the data objects are the "features." The choice of features used to describe the data objects is what makes it possible to derive rules or patterns. "Classifiers" are the rules or patterns used to cluster data objects into particular classes or to determine relationships between features. As discussed in the examples below, the classifiers provide the specific features satisfied by glycans that bind a GBP with high affinity. These rules fall into two classes: (1) features that are present on a set of high-affinity glycan ligands and are thought to enhance binding; and (2) features that should not be present in high-affinity glycan ligands and are thought to be unfavorable for binding.
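As a minimal sketch of the distinction between data objects and features, the Python fragment below stores a linear glycan as a list of monosaccharides and linkages together with its binding signals, and derives simple features from it. The record layout, the names GlycanRecord and extract_features, and the restriction to unbranched chains are assumptions made for illustration only; they do not describe the CFG database schema or the feature extractor of Figure 1.

```python
from dataclasses import dataclass, field


@dataclass
class GlycanRecord:
    """Hypothetical data object: structure description plus binding signals."""
    name: str
    monosaccharides: list          # written from the non-reducing end, e.g. ["Gal", "GlcNAc"]
    linkages: list                 # linkages[i] joins monosaccharides[i] and [i + 1]
    signals: dict = field(default_factory=dict)   # GBP name -> binding signal


def extract_features(record):
    """Return single residues and linked residue pairs as a set of feature strings."""
    features = set(record.monosaccharides)
    for i, link in enumerate(record.linkages):
        pair = record.monosaccharides[i] + link + record.monosaccharides[i + 1]
        features.add(pair)
    return features


# Illustrative record; the signal value is made up for the example.
lacnac = GlycanRecord("LacNAc", ["Gal", "GlcNAc"], ["b4"], {"galectin-3": 12.0})
lacnac_features = extract_features(lacnac)
```

A classifier would then operate on sets such as lacnac_features rather than on the raw record.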
The data mining platform comprises software modules that interact with one another (Figure 1) to perform the operations described above. One component is a feature extractor that interfaces with the CFG database in order to extract features. The object-based relational database used by the CFG facilitates the flexible definition of features.
Feature extraction and data preparation:
As described above, features can be extracted from glycans and/or their binding partners. In the particular application illustrated herein, certain features were extracted from the glycans on the glycan array, as listed in Table 1:
Table 1. Features extracted from the glycans on the glycan array.
The rule-based classification algorithm uses the features described in this table to identify patterns that characterize binding to a specific GBP.
Figure A200780030012D00181
The rationale for choosing the features shown is that the glycan binding site on a GBP typically accommodates a di- to tetrasaccharide. A tree-based representation of the glycan (rooted at the reducing end) is used to capture information about the monosaccharides and linkages in the glycan structure. This representation facilitates the extraction of a variety of features, including higher-order features such as sets of connected monosaccharide triplets (Figure 2). Data preparation involves generating a column-wise listing (Table 1) of all glycans in the latest version of the glycan array together with the features extracted for each glycan. From this master table of glycans and their features, a subset is selected for rule-based classification (see below) in order to determine the specific patterns that govern binding to a particular GBP or group of GBPs.
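The tree representation can be sketched as follows, purely for illustration. The Node class, the helper names, and the string format of the motifs are assumptions rather than the representation actually used by the platform, and the sketch collects only linear connected pairs and triplets (branched triplets, in which one residue carries two substituents, are not enumerated).

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    sugar: str                      # monosaccharide name, e.g. "GlcNAc"
    link: str = ""                  # linkage to the parent residue; empty at the root
    children: list = field(default_factory=list)


def all_nodes(root):
    """Yield every residue in the tree, starting from the reducing end."""
    yield root
    for child in root.children:
        yield from all_nodes(child)


def paths_from(node, length):
    """Return all downward paths of `length` residues that start at `node`."""
    if length == 1:
        return [[node]]
    return [[node] + rest
            for child in node.children
            for rest in paths_from(child, length - 1)]


def linear_features(root, length):
    """Collect connected linear motifs of `length` residues as strings."""
    motifs = set()
    for start in all_nodes(root):
        for path in paths_from(start, length):
            text = path[0].sugar
            for node in path[1:]:
                # prepend so that the non-reducing end is written first
                text = node.sugar + node.link + text
            motifs.add(text)
    return motifs


# Galb4(Fuca3)GlcNAcb3Gal, rooted at the reducing-end Gal
tree = Node("Gal", children=[
    Node("GlcNAc", "b3", [Node("Gal", "b4"), Node("Fuc", "a3")]),
])
pairs = linear_features(tree, 2)
triplets = linear_features(tree, 3)
```

Rooting the tree at the reducing end means each residue has exactly one parent, so every connected linear motif corresponds to a single downward path.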
Classifiers:
Different types of classifiers have been developed and used in many applications. They fall into three main categories: mathematical methods, distance methods, and logic methods. These different methods, together with their advantages and disadvantages, are discussed in detail in Weiss and Indurkhya (Predictive Data Mining: A Practical Guide, Morgan Kaufmann, San Francisco, 1998). For this particular application, we chose a method called rule induction, which belongs to the logic methods. The rule induction classifier generates patterns in the form of IF-THEN rules.
One main advantage of logic methods, and in particular of classifiers such as rule induction that generate IF-THEN rules, is that the results of the classifier can be interpreted more readily than those of other statistical or mathematical methods. This makes it possible to explore the structural and biological significance of the rules or patterns discovered. An exemplary rule generated using the features described earlier (see Table 1) is: IF a glycan contains "Galb4GlcNAcb3Gal[B]" and does not contain "Fuca3GlcNAc[B]", THEN the glycan will bind galectin-3 with higher affinity.
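The exemplary rule can be written as a small data structure, as sketched below. The IfThenRule class and its field names are hypothetical, and the feature labels (including the "[B]" suffix) are treated as opaque strings copied from the rule as quoted; the sketch shows only how such a rule is evaluated, not how the rule induction algorithm derives it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IfThenRule:
    """Hypothetical IF-THEN rule: required features, excluded features, conclusion."""
    contains: frozenset
    lacks: frozenset
    conclusion: str

    def applies(self, features):
        features = set(features)
        # every required feature is present and no excluded feature is present
        return self.contains <= features and not (self.lacks & features)


galectin3_rule = IfThenRule(
    contains=frozenset({"Galb4GlcNAcb3Gal[B]"}),
    lacks=frozenset({"Fuca3GlcNAc[B]"}),
    conclusion="binds galectin-3 with higher affinity",
)

unfucosylated = {"Galb4GlcNAcb3Gal[B]", "Galb4GlcNAc"}
fucosylated = {"Galb4GlcNAcb3Gal[B]", "Fuca3GlcNAc[B]"}
```

Here galectin3_rule.applies(unfucosylated) is true and galectin3_rule.applies(fucosylated) is false, which mirrors the two kinds of conditions in the rule: a feature that must be present and a feature that must be absent.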
The particular rule induction algorithm used in this case is the one developed by Weiss and Indurkhya (Predictive Data Mining: A Practical Guide, Morgan Kaufmann, San Francisco, 1998).
Extent of binding
A threshold distinguishing low-affinity from high-affinity binding was determined for each glycan array screening data set (Figure 3).
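Once a threshold has been chosen for a data set, labeling the glycans is straightforward, as the sketch below illustrates. How the threshold itself is determined is shown in Figure 3 and is not reproduced here; the function name, the example signal values, the threshold value, and the choice to count a signal equal to the threshold as "high" are all assumptions for illustration.

```python
def label_by_threshold(signals, threshold):
    """Label each glycan "high" or "low" by comparing its signal to the threshold."""
    labels = {}
    for glycan, value in signals.items():
        # assumption: a signal equal to the threshold counts as high-affinity
        labels[glycan] = "high" if value >= threshold else "low"
    return labels


# Made-up signal-to-noise values for three hypothetical glycans
example_signals = {"glycan_1": 42.0, "glycan_2": 3.5, "glycan_3": 10.0}
labels = label_by_threshold(example_signals, threshold=10.0)
```

The resulting labels serve as the classes that a rule-based classifier tries to explain in terms of glycan features.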
By applying the data mining method to the high-throughput CFG glycan array data, we have identified sets of features of the glycans that bind different GBPs. Three particular systems were chosen as examples: (1) DC-SIGN and DC-SIGNR; (2) galectins; and (3) hemagglutinin H5. Each of these GBP families is reasonably well characterized in terms of its glycan ligand preferences. Because recent studies have outlined the structural basis for the different ligand specificities of DC-SIGN and DC-SIGNR on the basis of glycan array data, the first example serves as further validation of the method. Earlier studies systematically assessed the ligand specificities of different galectins. However, the CFG glycan array presents a significantly larger range of glycan structures for screening the ligand specificity of different galectins. Therefore, applying the method to the galectin data sets should provide additional rules that govern the binding of different galectins to their glycan ligands.
Example 2: Application of the method to DC-SIGN and DC-SIGNR
DC-SIGN and DC-SIGNR belong to the type II transmembrane receptor subfamily of C-type lectins, which recognize and bind glycan ligands in a Ca2+-dependent manner. DC-SIGN is abundantly expressed on dendritic cells and plays a key role in the adhesion of T cells to antigen-presenting dendritic cells via the ICAM-3 molecule, thereby initiating the immune response. In addition, DC-SIGN has been shown to play an important role in the recognition of pathogens such as HIV by dendritic cells. Indeed, it has been shown that binding of HIV to DC-SIGN on dendritic cells enhances infection of T cells. DC-SIGNR, on the other hand, which shares 77% sequence identity with DC-SIGN, is found on endothelial cells in the liver, lymph nodes, and placenta.
Each of these proteins contains a single carbohydrate recognition domain (CRD) at its C-terminus. The extracellular α-helical domain (adjacent to the CRD) of both proteins facilitates tetramerization of the CRDs, leading to multivalent interactions with glycan ligands. A large body of crystal structure information on DC-SIGN and DC-SIGNR, including co-crystal structures with different glycan ligands, is available. Recently, these proteins were screened on the CFG glycan array and shown to have distinct ligand specificities and signaling properties. The glycan array data for these proteins therefore provide a good framework for validating the data mining method.
Outline in the example 1 as mentioned, analyzing corresponding glycan feature (table 1) with the glycan screening of DC-SIGN and DC-SIGNR is to extract from the CFG database.Use these features to carry out rule-based classification, wherein main objective function be each glycan with two kinds of protein in each the average signal-to-noise ratio that combines.The result who is obtained by classification is summarized in the table 2:
Table 2. Rules governing the binding of glycans to DC-SIGN and DC-SIGNR
The rules are derived from the features [Table 1], where #[] specifies the number of occurrences. The last rule governing DC-SIGN binding is more complex and involves an "&" combination of several features, which indicates that every individual rule must be satisfied. For DC-SIGN, the glycans that could not be grouped into rules [false negatives] are also shown.
Figure A200780030012D00201
Figure A200780030012D00211
The overall performance of the classification is good: the rule-based classification predicted 100% of the candidate high-mannose structures for DC-SIGNR and 16 of the 20 high binders for DC-SIGN. Notably, there were no false positives, that is, no glycans that were predicted to bind but did not. The first clear implication of these results is that both DC-SIGN and DC-SIGNR bind high-mannose structures with high affinity. According to the glycan array data, the presence of Mana3(Mana6)Mana6Man is an important rule that captures six different mannose glycan ligands that bind with high affinity. This observation agrees with earlier crystal structure studies.
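The performance figures quoted above can be restated as recall and precision. The following is a minimal sketch for illustration only, not part of the described method: the glycan identifiers are hypothetical placeholders, and only the counts (20 observed high binders for DC-SIGN, 16 of them captured, no false positives) are taken from the text.

```python
def summarize(predicted: set, observed: set) -> dict:
    """Compare the glycans predicted to bind with the observed high binders."""
    tp = len(predicted & observed)   # predicted to bind and observed to bind
    fp = len(predicted - observed)   # predicted to bind but did not bind
    fn = len(observed - predicted)   # bound but not captured by any rule
    return {
        "true_positives": tp,
        "false_positives": fp,
        "false_negatives": fn,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
    }


# Hypothetical identifiers standing in for the 20 DC-SIGN high binders.
observed = {f"glycan_{i}" for i in range(1, 21)}
# The rules capture 16 of them and predict nothing else.
predicted = {f"glycan_{i}" for i in range(1, 17)}

result = summarize(predicted, observed)
print(result)
```

With these counts the recall is 16/20 and, because there are no false positives, the precision is 1.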
In addition to high-mannose ligands, DC-SIGN binds another set of ligands, fucosylated ligands, characterized by different features. These fucosylated ligands do not bind DC-SIGNR. Fuca4GlcNAc is a motif common to glycan structures containing Lewis a [Fuca4(Galb3)GlcNAc]. Fuca3(Galb4)GlcNAc is another common motif, Lewis x, present at the non-reducing ends of N-linked and O-linked glycans. Both features are characteristic of high-affinity binders of DC-SIGN. This observation is consistent with the distinct binding of DC-SIGN to fucosylated ligands that was observed previously in crystal structures of DC-SIGN with Lewis x-containing glycans. Detailed studies of the crystal structures of DC-SIGN and DC-SIGNR with high-mannose and fucosylated ligands have shown that, although the two proteins share a similar mode of binding to high-mannose ligands, binding to fucosylated ligands is entirely different and can be achieved only through amino acids in the CRD of DC-SIGN.
Another notable observation provided by the analysis is the absence of specific features in high-affinity binders. In other words, the presence of Neua3Galb4GlcNAc and Gala3Gal together with the Lewis x motif is unfavorable for the binding of these ligands to DC-SIGN. Confirming this rule by examining the crystal structure of DC-SIGN with a Lewis x-containing glycan ligand highlights the value of the data mining method. Because the 3-OH position of Gal in Fuca3(Galb4)GlcNAc lies close to the CRD of DC-SIGN (Fig. 4), any bulky substitution at this position (including sulfation, sialylation, and the like) would cause unfavorable steric contacts with the protein and thus disrupt binding.
According to the crystal structures of DC-SIGN and DC-SIGNR with Lewis x-containing glycans, the primary binding of fucosylated ligands involves the equatorial oxygens, 3-OH and 4-OH, of Fuc, which coordinate the Ca²⁺ ion. Hence, even in the case of the Lewis a antigen Fuca4(Galb3)GlcNAc, primary binding involves the 3-OH and 4-OH of Fuc (Fig. 4). Interestingly, the rule involving this motif, #[Fuca4GlcNAc] > 0, does not explicitly include the presence of the Galb3GlcNAc linkage. The analysis therefore suggests that the presence of Gal contributes more to the binding of DC-SIGN to fucosylated ligands containing the Lewis x motif than to those containing the Lewis a motif.
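The rule notation used in this example (#[feature] counts joined by "&") can be sketched in code. The following is a minimal illustration under stated assumptions, not the classifier actually used: it assumes each glycan has already been reduced to a dictionary of feature counts, it assumes a glycan is predicted to bind when at least one rule matches, and the names Condition, rule_matches, predicts_binding, and ILLUSTRATIVE_RULES are hypothetical. Only the first rule, #[Fuca4GlcNAc] > 0, is quoted from the text; the second is an illustrative reading of the unfavorable patterns discussed above, and the actual rules are those of Table 2.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Condition:
    feature: str    # feature name, as it would appear in Table 1
    present: bool   # True means #[feature] > 0, False means #[feature] == 0

    def holds(self, counts: Dict[str, int]) -> bool:
        n = counts.get(self.feature, 0)
        return n > 0 if self.present else n == 0


Rule = List[Condition]


def rule_matches(rule: Rule, counts: Dict[str, int]) -> bool:
    """An '&' combination: every condition in the rule must hold."""
    return all(condition.holds(counts) for condition in rule)


def predicts_binding(rules: List[Rule], counts: Dict[str, int]) -> bool:
    """Assumption: a glycan is predicted to bind if at least one rule matches."""
    return any(rule_matches(rule, counts) for rule in rules)


# Illustrative rules only; they are not a transcription of Table 2.
ILLUSTRATIVE_RULES: List[Rule] = [
    [Condition("Fuca4GlcNAc", present=True)],
    [
        Condition("Fuca3(Galb4)GlcNAc", present=True),
        Condition("Neua3Galb4GlcNAc", present=False),
        Condition("Gala3Gal", present=False),
    ],
]

# Hypothetical feature counts for a glycan carrying Fuca4GlcNAc once.
print(predicts_binding(ILLUSTRATIVE_RULES, {"Fuca4GlcNAc": 1}))
```

Keeping the rules as plain data makes it straightforward to list which rule captured a given glycan and which high binders were captured by none of them (the false negatives).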
Example 3: Application of the method to galectins
Galectins belong to a family of soluble GBPs known to bind β-galactosides; they were earlier defined as S-type lectins because their activity requires reducing thiols. Unlike C-type lectins (such as DC-SIGN and DC-SIGNR), galectins do not require Ca²⁺ for ligand binding. Galectins have been implicated in diverse biological roles, including cell development, apoptosis, cancer, and immune responses. Although galectins are generally known to bind type I (Galb3GlcNAc) and type II (Galb4GlcNAc) lactosamine units, little is understood about their finer substrate specificity and how it relates to their diverse biological roles. The rule-based data mining method was used to analyze the data sets for human galectin-1 and galectin-3. The organization of the CRDs in these two galectins is fundamentally different. Galectin-1 and galectin-3 share a similar C-terminal F3-type CRD. Galectin-1 is typically a homodimer of CRDs, whereas galectin-3 comprises a single CRD linked to an N-terminal domain. The N-terminal domain of galectin-3 has been suggested to enhance its affinity for glycan ligands.
As in the example above, the rule-based data mining method was used to identify the features on the glycan array that enhance or abolish the binding of glycan ligands to galectin-1 and galectin-3 (Table 3):
Table 3. Rules governing the binding of glycans to galectin-1 and galectin-3
The rules are derived from the features [Table 1], where #[] specifies the number of occurrences, & denotes a combination of rules each of which must be satisfied, and a negated [] term indicates that the pattern is absent. The "/" in a2/3 and a3/6 denotes a2 or a3 and a3 or a6, respectively.
Figure A200780030012D00221
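The "/" shorthand in the notes to Table 3 (a2/3 meaning a2 or a3, a3/6 meaning a3 or a6) can be expanded mechanically into explicit patterns. The sketch below is a hypothetical helper, not part of the described method; it assumes the compact linkage notation used in this example (an anomer letter a or b followed by a position digit) and does not handle other notations.

```python
import re
from itertools import product
from typing import List

# An anomer letter (a or b) followed by slash-separated position digits, e.g. "a3/6".
_ALTERNATIVES = re.compile(r"([ab])(\d(?:/\d)+)")


def expand_linkages(pattern: str) -> List[str]:
    """Expand every 'a2/3'-style alternative into one explicit pattern per choice."""
    fixed_parts: List[str] = []
    choices: List[List[str]] = []
    pos = 0
    for m in _ALTERNATIVES.finditer(pattern):
        fixed_parts.append(pattern[pos:m.start()])
        anomer = m.group(1)
        choices.append([anomer + digit for digit in m.group(2).split("/")])
        pos = m.end()
    fixed_parts.append(pattern[pos:])

    expanded = []
    for combo in product(*choices):
        # Interleave the fixed text with the linkage chosen for each alternative.
        text = fixed_parts[0]
        for chosen, tail in zip(combo, fixed_parts[1:]):
            text += chosen + tail
        expanded.append(text)
    return expanded


print(expand_linkages("Neu5Aca3/6Galb4GlcNAc"))
```

A pattern with no "/" alternative is returned unchanged as a single-element list, so the helper can be applied uniformly to every pattern in a rule.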
The rules governing the binding of high-affinity ligands to galectin-1 and galectin-3 are more complex than those obtained for DC-SIGN and DC-SIGNR. Although galectin-1 and galectin-3 are known to bind type II and type I lactosamine units with similar affinity, the glycan array data did not reveal any type I (Galb3GlcNAc) binders at the intensity threshold used to identify high binders.
For galectin-1, the first rule (Table 3), which captures 8 of the 9 high binders, consists of the presence of at least one lactosamine unit in a chain at least three monosaccharides long. Notably, there are no false positives. Analysis of low- and high-affinity binders shows that several patterns in rule 1 have an adverse effect on binding. Fucosylation of GlcNAc, terminal fucosylation of Gal, sialylation of Gal, and the presence of Gala3Gal or Gala4Gal in combination with a type II lactosamine unit all have an adverse effect on binding. In addition, a type II lactosamine unit in the form -Galb4GlcNAcb6GalNAc- on a core 2 (or core 4) O-linked core has an adverse effect on binding.
The second rule reveals an interesting pattern: if the glycan contains, as its motif, a type II polylactosamine repeat with at least two Galb4GlcNAc units, sialylation does not affect high-affinity binding. Earlier studies showed that glycans with terminal sialylation are candidate ligands for galectin-1. Because the sialylated glycans used in that study contained at least two Galb4GlcNAc units, those results are consistent with the rule. The rule further indicates that galectin-1 binds internal Galb4GlcNAc units and that any other pattern farther along the chain toward the non-reducing end does not affect high-affinity binding. There is only one false negative, which contains Gal[3-O-SO3]b3GalNAc.
Although the rules for galectin-3 binding are similar to those for galectin-1, there are some differences. These differences are captured in Table 4:
Table 4. Comparison of binding to galectin-1 and galectin-3
Figure A200780030012D00232
Figure A200780030012D00241
The main difference lies in the first rule, which combines the absence of the Mana3(Mana6)Man unit with the other patterns. It should be noted that this rule does not exclude all N-linked glycans. Rather, it indicates that galectin-3 prefers linear Galb4GlcNAc (polylactosamine) repeats over Galb4GlcNAc present on different branches attached to the Mana3(Mana6)Man core. Another difference is that binding to galectin-3 is not inhibited by fucosylation of Gal in lactosamine, whereas binding to galectin-1 is.
As in the DC-SIGN and DC-SIGNR example, the results of the galectin data analysis were compared with structural aspects of ligand binding. Structural complexes of galectin-1 and galectin-3 with different ligands, such as Galb4GlcNAc, Neu5Aca3Galb4GlcNAc, Neu5Aca3Galb4(Fuca3)GlcNAc, and Neu5Aca6Galb4GlcNAc, were analyzed. The crystal structures of galectin-1 and galectin-3 with the Galb4GlcNAc ligand were used as frameworks on which to superimpose the other ligand structures and build the different structural complexes. The 4-OH and 6-OH groups of Gal and the 3-OH of GlcNAc in the Galb4GlcNAc unit are involved in interactions with amino acids of the CRDs of galectin-1 and galectin-3. Substitution at any of these oxygens therefore results in unfavorable steric contacts with the protein.
The galectin binding rules obtained by the method show that Gala4Gal, NeuAca6Gal, and Fuca3GlcNAc are unfavorable for binding, which is consistent with the analysis of the structural complexes. The crystal structures also show that Galb4GlcNAc can be extended on the non-reducing side by another such unit (via a b3 linkage), implying that both galectin-1 and galectin-3 can bind internal Galb4GlcNAc units. This validates the rule that longer chains with terminal units such as Galb4(Fuca3)GlcNAc or Neu5Aca3/6Galb4GlcNAc do not affect high-affinity binding.
To further validate the rules for binding to galectin-1 and galectin-3, the rules were used to predict the relative binding to galectin-1 and galectin-3 of two different glycans not present on the glycan array (Table 5):
Table 5. Predicted relative binding to galectin-1 and galectin-3
Figure A200780030012D00242
Figure A200780030012D00251
As observed earlier, the rules predict that galectin-3 prefers linear lactosamine repeats, whereas galectin-1 prefers lactosamine in a branched arrangement. This agrees with the ligand binding trends observed by Hirabayashi et al. (2002).
Example 4: Application of the method to hemagglutinin
A framework was developed for the binding of the H5N1 subtype to α2-3/6 sialylated glycans (Fig. 7). The framework comprises two complementary analyses. The first involves a systematic analysis of the HA glycan binding site and its interactions with α2-3 and α2-6 sialylated glycans using HA-glycan co-crystal structures of H1, H3, and H5 (Table 6).
This analysis provides important insight into the interactions of the HA glycan binding site with a variety of α2-3/6 sialylated glycans (including glycans of umbrella or cone topology). The second analysis involves a data mining method for analyzing glycan array data on different H1, H3, and H5 HAs. This data mining analysis correlates the structural features of glycans on the microarray with strong, weak, and no binding to different wild-type and mutant HAs (Table 7).
Importantly, these correlations (classifiers) capture the effect of subtle structural variations of the α2-3/6 sialylated linkages and/or of different topologies on binding to different HAs. As discussed below, the correlations of glycan features obtained from the data mining analysis are mapped onto the HA glycan binding site, providing a framework for systematically studying the binding of H1, H3, and H5 HA to α2-3 and α2-6 sialylated glycans (including glycans of different topologies).
To give just one example, applying this framework to H5 HA in accordance with the present invention illustrates how the chain length of the α2-6 oligosaccharide, especially in the context of the degree of branching, becomes more important than subtle differences in the structural variation of the glycan. For instance, the difference between a triantennary structure with a single α2-6 motif and a biantennary structure with a longer α2-6 motif will affect HA-glycan binding more than structural variations of the individual α2-6 motif. This is established by the different length-dependent classifiers for the α2-6 motif obtained herein from data mining (Table 7).
Framework for the binding specificity of H1, H3, and H5 HA for α2-3 and α2-6 sialylated glycans
Crystal structures of HA from H1 (PDB IDs: 1RD8, 1RU7, 1RUY, 1RV0, 1RVT, 1RVX, 1RVZ), H3 (PDB IDs: 1MQL, 1MQM, 1MQN), and H5 (1JSN, 1JSO, 2FK0) and of their complexes with α2-3 and/or α2-6 sialylated oligosaccharides provide molecular insight into the residues involved in specific HA-glycan interactions. Recently, the glycan receptor specificity of avian and human H1 and H3 subtypes was described in detail by screening wild-type and mutant forms on glycan arrays comprising a variety of α2-3 and α2-6 sialylated glycans.
The Asp190Glu mutation in the HA of the 1918 human pandemic virus reverses its specificity from α2-6 to α2-3 sialylated glycans (Stevens et al., J. Mol. Biol., 355:1143, 2006; Glaser et al., J. Virol., 79:11533, 2005). On the other hand, the double mutation Glu190Asp and Gly225Asp in avian H1 (A/Duck/Alberta/35/1976) reverses its specificity from α2-3 to α2-6 sialylated glycans. For the H3 subtype, the amino acid changes Gln226 to Leu and Gly228 to Ser between the 1963 avian H3N8 strain and the 1967-68 pandemic human H3N2 strain correlate with a change in preference from α2-3 to α2-6 sialylated glycans (Rogers et al., Nature, 304:76, 1983). The relationship between HA glycan binding specificity and transmission efficiency was demonstrated in a ferret model using the highly pathogenic and virulent 1918 H1N1 virus (Tumpey T.M. et al., Science 315:655, 2007). Switching the receptor binding specificity from the parental human α2,6 sialylated glycan receptor preference (SC18) to an avian α2,3 sialylated receptor preference (AV18) produced a virus that could not transmit. On the other hand, a virus with mixed α2,3/α2,6 sialylated glycan specificity (A/New York/1/18 (NY18)) did not transmit, whereas, surprisingly, the A/Texas/36/91 (Tx91) virus, which likewise has mixed α2,3/α2,6 sialylated glycan specificity, transmitted efficiently. In addition, as noted above, various highly pathogenic H5N1 strains also exhibit mixed α2,3/α2,6 sialylated glycan specificity (Yamada S. et al., Nature 444:378, 2006), yet are still unable to spread from person to person. These mixed results regarding HA sialylated glycan specificity and transmission raise the following questions. First, is there diversity among the sialylated glycans found in the human upper respiratory tract, and can this explain viral specificity and tissue tropism? Second, how might subtle variations in glycan conformation contribute to the binding of α2-3 and/or α2-6 sialylated glycans in the HA glycan binding pocket? More generally, what are the glycan binding requirements for human adaptation of influenza A virus HA?
Structural constraints imposed by glycan topology and substitution on binding of H1, H3, and H5 HA to α2-3 sialylated glycans
Analysis of all the HA-glycan co-crystal structures shows that the orientation of the Neu5Ac sugar (SA) is fixed relative to the HA glycan binding site. Anchoring of SA involves a set of amino acids, Phe95, Ser/Thr136, Trp153, His183, and Leu/Ile194, that is highly conserved across HA subtypes. The specificity of HA for α2-3 or α2-6 is therefore governed by interactions of the HA glycan binding site with the glycosidic oxygen atom and with the sugars other than SA.
The conformation of the Neu5Acα2-3Gal linkage is such that the Gal and the sugars beyond Gal in α2-3 map to a cone-shaped region governed by the glycosidic torsion angles of this linkage (Fig. 6). Representative regions of minimum-energy conformations are specified by Φ values of about -60, 60, or 180, with ψ sampled from -60 to 60 (Fig. 14). In these minimum-energy regions, the sugars beyond Gal in α2-3 project out of the HA glycan binding site. This is also evident from the co-crystal structures of HA with the α2-3 motif (Neu5Acα2-3Galβ1-3/4GlcNAc-), in which the Φ value is typically about 180 (termed the anti conformation). The anti conformation causes the α2-3 motif to project out of the pocket. This implies that structural variations (sulfation and fucosylation) branching from the Gal and/or GlcNAc (or GalNAc) sugars, centered on the three-sugar (trisaccharide) α2-3 motif, will have the greatest effect on HA binding (Fig. 7). This structural implication agrees with the three different classes of classifiers for HA binding to α2-3 sialylated glycans obtained from the data mining analysis (Table 7). A feature common to all three classes is that Neu5Acα2-3Gal is not present together with GalNAcα/β1-4Gal. Crystal structure analysis shows that a GalNAc attached to the Gal of Neu5Acα2-3Gal results in unfavorable steric contacts with the protein, which agrees with the classifiers.
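The low-energy torsion regions described above can be expressed as a small check on a (Φ, ψ) pair. The following is a minimal sketch, not part of the described analysis: it assumes the angles are given in degrees, and the 30-degree tolerance around each Φ center, like the function names, is an arbitrary illustrative choice that does not come from the text.

```python
from typing import Optional

# Approximate centers of the low-energy phi regions named in the text (degrees).
PHI_CENTERS = {"phi ~ -60": -60.0, "phi ~ 60": 60.0, "phi ~ 180 (anti)": 180.0}
# Range over which psi is sampled, according to the text (degrees).
PSI_RANGE = (-60.0, 60.0)


def angular_distance(a: float, b: float) -> float:
    """Smallest absolute difference between two angles, allowing for wraparound."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def phi_region(phi: float, tolerance: float = 30.0) -> Optional[str]:
    """Return the label of the nearest phi center, or None if phi is too far from all."""
    label, center = min(
        PHI_CENTERS.items(), key=lambda item: angular_distance(phi, item[1])
    )
    return label if angular_distance(phi, center) <= tolerance else None


def in_low_energy_region(phi: float, psi: float, tolerance: float = 30.0) -> bool:
    """True if phi is near one of the centers and psi lies in the sampled range."""
    return phi_region(phi, tolerance) is not None and PSI_RANGE[0] <= psi <= PSI_RANGE[1]


print(phi_region(-175.0))
```

The wraparound in angular_distance matters because Φ values of -175 and 180 differ by only 5 degrees, so both fall in the anti region.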
Apart from the conserved anchor points for sialic acid binding, two key residues, Gln226 and Glu190, are involved in binding the Neu5Acα2-3Gal motif. Gln226, located at the base of the binding site, interacts with the glycosidic oxygen atom of the Neu5Acα2-3Gal linkage (Fig. 15, panels C, D). Glu190, located on the side opposite Gln226, interacts with the Neu5Ac and Gal monosaccharides (Fig. 15, panels C, D). In addition, residues Ala138 (adjacent to Gln226) and Gly228 (adjacent to Glu190), which are highly conserved in avian HA, may be involved in promoting the correct conformation of Gln226 and Glu190 for optimal interaction with α2-3 sialylated glycans (Fig. 15). The human H1 subtype APR34 contains all four amino acids, Ala138, Glu190, Gln226, and Gly228, and binds α2-3 sialylated glycans, as observed in the crystal structure.
Superimposing the glycan binding sites in the crystal structures of AAI68_H3_23, ADU67_H3_23, and APR34_H1_23 provides additional insight into the positioning of the Glu190 side chain and its effect on HA binding to α2-3 sialylated glycans. The side chain of Glu190 in H1 HA extends farther (by about 1 Å) into the binding site than the side chain of Glu190 in H3 HA. This can be attributed to an amino acid difference near residue Glu190: Pro186 in H1 HA versus Ser186 in H3 HA. As shown by the data mining analysis of the glycan microarray data, this change in Glu190 side chain conformation may be related to the binding of avian H1 (but not avian H3) to some α2-6 sialylated glycans with moderate affinity (Table 7). In addition, replacing Gly228 with Ser (a notable change between avian and human H3 subtypes) alters the conformation of Glu190 and interferes with the interaction of human H3 HA with Neu5Acα2-3Gal in the anti conformation. This is further illustrated by the different conformation (not the anti conformation) of the Neu5Acα2-3Gal motif observed in the human AAI68_H3_23 co-crystal structure. The Neu5Acα2-3Gal motif in this conformation makes fewer optimal contacts with the human H3 HA binding site than the same motif in the anti conformation makes with avian H3 HA (Fig. 14). Because of this loss of contacts, the Gly228Ser mutation in human H3 HA makes its glycan binding site less favorable for interaction with α2-3 sialylated glycans. This structural observation agrees with the results of the data mining analysis (Table 7), which show that human H3 HA has only moderate affinity for some α2-3 sialylated glycans.
How do structural variations around Neu5Acα2-3Gal affect HA-glycan interactions? Lys193, which is highly conserved in avian H5, is positioned (Fig. 5) to interact with 6-O-sulfated Gal and/or 6-O-sulfated GlcNAc in Neu5Acα2-3Galβ1-4GlcNAc. This observation is corroborated by the data mining analysis, in which only avian H5 binds with high affinity to α2-3 sialylated glycans sulfated at Gal or GlcNAc (Table 7). A basic amino acid at position 222 could interact in a similar way with 4-O-sulfated GlcNAc in the Neu5Acα2-3Galβ1-3GlcNAc motif or 6-O-sulfated GlcNAc in the Neu5Acα2-3Galβ1-4GlcNAc motif. On the other hand, bulky side chains, such as Lys222 in H1 and H5 and Trp222 in H3, potentially interfere with the fucosylated GlcNAc in the Neu5Acα2-3Galβ1-4(Fucα1-3)GlcNAc motif. This structural observation corroborates the classifier rule α2-3 type C observed for the avian H3 and H5 strains (Table 7), which shows that fucosylation at GlcNAc is unfavorable for binding. Because the amino acids in the glycan binding sites of Viet04_H5 HA and ADS97_H5 HA are nearly identical, the two bind α2-3 sialylated glycans similarly (Table 7).
Thus, for binding α2-3 sialylated glycans, in addition to the residues that anchor Neu5Ac, the highly conserved Glu190 and Gln226 in all avian H1, H3, and H5 subtypes are critical for binding the Neu5Acα2-3Gal motif. Contacts with GlcNAc or GalNAc and with substitutions such as sulfation and fucosylation in the α2-3 motif involve the amino acids at positions 137, 186, 187, 193, and 222. HAs from H1, H3, and H5 exhibit different binding specificities for the various α2-3 sialylated glycans present on the glycan microarray. The amino acid residues at these positions are not conserved across HAs, which accounts for the different binding specificities.
Structural constraints imposed by glycan topology and substitution on binding of H1 and H3 HA to α2-6 sialylated glycans
In the case of the Neu5Acα2-6Gal linkage, the presence of the additional C6-C5 bond provides increased conformational flexibility. The Gal and subsequent sugars in α2-6 span an umbrella-shaped region that is much larger than the cone-shaped region in the α2-3 case (Fig. 6). Depending on the length of the oligosaccharide, binding to α2-6 involves optimal contacts with the Neu5Ac and Gal sugars at the base of the umbrella topology and with the subsequent sugars. Short α2-6 oligosaccharides, such as Neu5Acα2-6Galβ1-3/4Glc, will potentially adopt the cone topology. On the other hand, the presence of GlcNAc rather than Glc in the α2-6 motif Neu5Acα2-6Galβ1-4GlcNAc- will potentially favor the umbrella topology, which is stabilized by optimal van der Waals contacts between the acetyl carbons of GlcNAc and Neu5Ac. However, the α2-6 motif can also adopt the cone topology when additional factors, such as branching and HA binding, compensate for the stability provided by the umbrella topology. The cone topology of α2-6 motifs present as part of multiple short oligosaccharide branches in N-linked glycans can be stabilized by intra-sugar interactions. On the other hand, α2-6 motifs in long oligosaccharide branches (at least a tetrasaccharide) favor the umbrella topology. The co-crystal structures of H1 and H3 HA with the α2-6 motif (Neu5Acα2-6Galβ1-4GlcNAc-) support this view: a Φ of about -60 (termed the cis conformation) causes the sugars beyond Neu5Acα2-6Gal to bend toward the HA protein, giving optimal contacts with the binding site (Fig. 7).
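The length dependence described above (short α2-6 oligosaccharides tending toward the cone topology, branches of at least a tetrasaccharide favoring the umbrella topology) can be sketched as a crude heuristic. This is an illustration only, not the classifier obtained by data mining: it assumes a linear branch written in compact notation such as Neu5Aca6Galb4GlcNAcb3Galb4Glc, recognizes only the residue names listed in the pattern, and ignores the other factors mentioned in the text (GlcNAc versus Glc, branching, HA binding). The function names are hypothetical.

```python
import re
from typing import List

# Residue names recognized by this sketch, each optionally followed by a linkage
# such as "a6" or "b4". Longer names come first so "GlcNAc" is not read as "Glc".
_RESIDUE = re.compile(r"(Neu5Ac|GlcNAc|GalNAc|Gal|Glc|Man|Fuc)([ab]\d)?")


def residues(branch: str) -> List[str]:
    """Split a linear branch into residue names, rejecting unsupported notation."""
    tokens = _RESIDUE.findall(branch)
    if "".join(name + link for name, link in tokens) != branch:
        raise ValueError(f"unsupported notation: {branch!r}")
    return [name for name, _ in tokens]


def favored_topology(branch: str) -> str:
    """Length-only heuristic: four or more residues -> 'umbrella', otherwise 'cone'."""
    if not branch.startswith("Neu5Aca6Gal"):
        raise ValueError("expected a branch starting with Neu5Aca6Gal")
    return "umbrella" if len(residues(branch)) >= 4 else "cone"


print(favored_topology("Neu5Aca6Galb4GlcNAcb3Galb4Glc"))
```

Branched notation with parentheses is deliberately rejected rather than guessed at, since a length count is only meaningful for a single linear branch.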
In H1 HA, superimposing the glycan binding domains of the HA from the human H1N1 subtype (A/South Carolina/1/1918), ASI30_H1_26, and APR34_H1_26 provides insight into the amino acids involved in conferring specificity for α2-6 sialylated glycans. Lys222 and Asp225 are positioned to interact with the oxygen atoms of Gal in the Neu5Acα2-6Gal motif. Asp190 and Ser/Asn193 are positioned to interact with the additional monosaccharides GlcNAcα1-3Gal of the Neu5Acα2-6Galα1-4GlcNAcα1-3Gal motif (Fig. 15, panels A, B).
In the H1 HA of the 1918 human pandemic strain, Asp190, Lys222, and Asp225 are highly conserved. Although the amino acid Gln226 is highly conserved in all avian and human H1 subtypes, it does not appear to be as involved in binding α2-6 sialylated glycans (in human H1 subtypes) as it is in binding α2-3 sialylated glycans (in avian H1 subtypes). Data mining analysis of glycan array results for wild-type and mutant forms of avian and human H1 HA further confirms the role of the above amino acids in binding α2-6 sialylated glycans (Table 7). The Glu190Asp/Gly225Asp double mutant of avian H1 HA reverses its binding to α2-6 sialylated glycans (Table 7). Moreover, the Lys222Leu mutant of human ANY18_H1 abolishes its binding to all sialylated glycans on the array, consistent with the important role of Lys222 in glycan binding.
To identify the amino acids that confer on H3N2 HA its specificity for binding α2-6 sialylated glycans, the glycan binding domains of the HA from human H3N2 (AAI68_H3), ADU63_H3_26, and ASI30_H1_26 were superimposed. Analysis of these superimposed structures shows that Leu226 is positioned to provide optimal van der Waals contacts with the C6 atom of the Neu5Acα2-6Gal motif and that Ser228 is positioned to interact with O9 of sialic acid. Ser228 in human H3 also interacts with Glu190 (unlike Gly228 in avian ADU63_H3, which cannot make this interaction), thereby affecting its side chain conformation. Compared with Glu190 in avian H3 HA, the side chain of Glu190 in human H3 HA is displaced slightly, by about 0.7 Å, toward the binding site. These differences limit the ability of human H3 HA to bind α2-3 sialylated glycans and correlate with its preferential binding to α2-6 sialylated glycans. The Gln226Leu and Gly228Ser mutations thus caused the reversal in glycan receptor specificity between avian H3 and the human H3 subtype during the 1967 pandemic.
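The mutation shorthand used throughout this example (for instance Gln226Leu, or a double mutant written Gln226Leu/Gly228Ser) can be parsed and applied to a table of binding-site residues. The sketch below is a hypothetical helper for bookkeeping only and says nothing about binding: the function names are assumptions, and the small avian H3 residue table contains only the three positions named in the text.

```python
import re
from typing import Dict, Tuple

# Three-letter residue, position, three-letter residue, e.g. "Gln226Leu".
_MUTATION = re.compile(r"([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})")


def parse_mutation(token: str) -> Tuple[str, int, str]:
    """Split 'Gln226Leu' into ('Gln', 226, 'Leu')."""
    m = _MUTATION.fullmatch(token.strip())
    if m is None:
        raise ValueError(f"not a mutation in Xxx123Yyy form: {token!r}")
    return m.group(1), int(m.group(2)), m.group(3)


def apply_mutations(site: Dict[int, str], spec: str) -> Dict[int, str]:
    """Return a copy of the residue table with every '/'-separated mutation applied."""
    mutated = dict(site)
    for token in spec.split("/"):
        old, position, new = parse_mutation(token)
        if mutated.get(position) != old:
            raise ValueError(
                f"position {position} holds {mutated.get(position)!r}, expected {old!r}"
            )
        mutated[position] = new
    return mutated


# Residues at the three positions discussed in the text for avian H3 HA.
avian_h3 = {190: "Glu", 226: "Gln", 228: "Gly"}
human_like_h3 = apply_mutations(avian_h3, "Gln226Leu/Gly228Ser")
print(human_like_h3)
```

Checking the original residue before substituting catches a mutation applied to the wrong subtype, such as Gly225Asp applied to a table that has no entry at position 225.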
Comparison of the HA from the 1967-68 pandemic H3N2 with HAs from more recent (post-1990) H3 subtypes shows that Glu190 is mutated to Asp in the recent subtypes. This mutation further enhances the binding of human H3 to α2-6 sialylated glycans, because Asp190 in human H3 is positioned to interact favorably with these glycans. This structural implication is further corroborated by data mining analysis of glycan array data for a human H3 subtype (A/Moscow/10/1999). This HA contains Asp190, Leu226, and Ser228 (Fig. 10) and exhibits a very strong preference for α2-6 sialylated glycans (Table 7).
The above observations highlight the similarities and differences between H1 and H3 HA in binding α2-6 sialylated glycans. In both H1 and H3 HA, Asp190 and Ser/Asn193 are positioned to make favorable contacts with the monosaccharides beyond the Neu5Acα2-6Gal motif (Fig. 15, panels A, B). The differences between H1 and H3 HA in the amino acids and in their contacts with α2-6 sialylated glycans provide different surface and ionic complementarity for binding these glycans. The Neu5Acα2-6Gal linkage has more conformational freedom than Neu5Acα2-3Gal. HAs that bind α2-6 sialylated glycans therefore have a wider binding pocket opening to accommodate this conformational freedom. Although Leu226 in human H3 HA is positioned to provide optimal van der Waals contacts with Neu5Acα2-6Gal, the ionic contacts with this motif provided by Gln226 in H1 HA are not optimal. On the other hand, Lys222 and Asp225 in H1 provide more optimal ionic contacts with α2-6 sialylated glycans than Trp222 and Gly225 in H3.
Structural constraints on the binding of wild-type and mutant H5 HA to α2-6 sialylated glycans
The interactions provided by the different amino acids in H1 and H3 HA suggest that the current avian H5N1 HA could mutate to an H1-like or H3-like glycan binding site and thereby switch its glycan receptor specificity to α2-6 sialylated glycans. Based on the above framework, hypothesized H1-like and H3-like mutations of H5 HA were further examined and tested, as discussed below.
Analysis of the superimposed ASI30_H1_26, APR34_H1_26, ADS97_H5_26, and Viet04_H5 structures provides insight into H1-like binding of H5 HA to α2-6 sialylated glycans. Because H1 and H5 HA belong to the same structural clade, their glycan binding sites share similar topology and amino acid distribution (Russell et al., Virology, 325:287, 2004). The highly conserved Lys222 in avian H5 HA is positioned to provide optimal contacts with the Gal of the Neu5Acα2-6Gal motif, similar to the analogous Lys in H1 HA. Glu190 and Gly225 in Viet04_H5 (in place of Asp190 and Asp225 in H1) do not provide the necessary H1-like contacts with the Neu5Acα2-6Galβ1-4GlcNAc motif. The Glu190Asp and Gly225Asp mutations in H5 HA could therefore potentially improve contacts with α2-6 sialylated glycans.
Analysis of the interactions of the sugars beyond GlcNAc in the Neu5Acα2-6Galβ1-4GlcNAcβ1-3Galβ1-4Glc oligosaccharide with the glycan binding pockets of H1 and H5 HA shows that, whereas Ser/Asn193 in H1 HA provides favorable contacts with the penultimate Gal, the analogous Lys193 in H5 has unfavorable steric overlap with the GlcNAcβ1-3Gal motif. A Lys193Ser mutation could therefore provide additional favorable contacts with α2-6 sialylated glycans (together with the Glu190Asp and Gly225Asp mutations).
Gln226, which is highly conserved in H1 HA, is also conserved in avian H5 HA. Because Gln226 in H1 HA plays a less active role in binding α2-6 sialylated glycans (as discussed above), mutating this amino acid to a hydrophobic amino acid such as Leu could potentially enhance its van der Waals contacts with the C6 atom of Gal in the Neu5Acα2-6Gal motif.
Superimposing ADU63_H3_26, AAI68_H3, ADS97_H5_26, and Viet04_H5 provides insight into H3-like binding of H5 HA to α2-6 sialylated glycans. Although the superposition structurally aligns the glycan binding sites of H5 and H3 HA, the alignment is not as good as the structural alignment between H5 and H1. The favorable van der Waals and ionic contacts with the Neu5Acα2-6Gal motif provided by Leu226 and Ser228, respectively, in H3 HA are absent in H5 HA (which has Gln226 and Gly228). Because Leu226 and Ser228 in human H3 HA are critical for binding α2-6 sialylated glycans, the Gln226Leu and Gly228Ser mutations in H5 HA could potentially provide optimal contacts with α2-6 sialylated glycans. In addition, in the comparison of H3 with H5 as well, Lys193 is positioned such that it would make unfavorable steric contacts with the monosaccharides beyond the Neu5Acα2-6Gal motif, in contrast to Ser193 in human H3 HA, which is positioned to provide favorable contacts. Although the HA from the 1967-68 pandemic H3N2 contains Glu190, Asp190 in H5 HA would be positioned to provide better ionic contacts with the Neu5Acα2-6Gal motif in longer oligosaccharides.
The roles of the above residues are further corroborated by data mining analysis of glycan array data for the wild-type and mutant forms of Viet04_H5 (Table 7). The double mutant Glu190Asp/Gly225Asp does not bind any glycan structure, because it loses the amino acid Glu190 needed for binding to α2-3 sialylated glycans, while Lys193 sterically interferes with binding to α2-6 sialylated glycans. The other double mutant, Gln226Leu/Gly228Ser, binds some α2-3 sialylated glycans (α2-3 Type B classifier), but also binds a single biantennary α2-6 sialylated glycan (α2-6 Type A classifier).
Analysis of the binding to this biantennary α2-6 sialylated glycan showed that the Neu5Acα2-6Gal linkage in this glycan can potentially bind to the double mutant in an extended conformation, but with only a few contacts (Figure 16). In addition, the Neu5Acα2-6Gal on the Manα1-3Man branch binds more favorably than the same motif on the Manα1-6Man branch, which has unfavorable steric contacts with the glycan binding site of H5 HA (Figure 16). The narrow specificity of the Gln226Leu/Gly228Ser double mutant for α2-6 sialylated glycans is consistent with Lys193 interfering with binding.
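As a rough illustration of the kind of data mining analysis of glycan array data referred to above, the following sketch compares how often glycans that carry a given structural feature bind, versus glycans that lack it. The glycan names, feature labels, signal values and binding threshold are hypothetical placeholders rather than data from Table 7, and the simple rate comparison is an assumed stand-in for the classifiers actually used.

```python
# Hypothetical glycan array readout: (glycan id, set of features, binding signal).
ARRAY_DATA = [
    ("glycan_1", {"a2-3_sialyl"}, 5200.0),
    ("glycan_2", {"a2-3_sialyl"}, 4100.0),
    ("glycan_3", {"a2-6_sialyl", "biantennary"}, 900.0),
    ("glycan_4", {"a2-6_sialyl"}, 150.0),
    ("glycan_5", {"biantennary"}, 120.0),
]


def feature_binding_rates(array_data, threshold):
    """Return {feature: (binding rate with feature, binding rate without it)}."""
    n_total = len(array_data)
    n_binders = sum(1 for _, _, signal in array_data if signal >= threshold)
    counts = {}  # feature -> [binders carrying it, glycans carrying it]
    for _, features, signal in array_data:
        for feature in features:
            entry = counts.setdefault(feature, [0, 0])
            entry[1] += 1
            if signal >= threshold:
                entry[0] += 1
    rates = {}
    for feature, (binders, carriers) in counts.items():
        others = n_total - carriers
        rate_with = binders / carriers
        rate_without = (n_binders - binders) / others if others else 0.0
        rates[feature] = (rate_with, rate_without)
    return rates


# The threshold of 500 signal units is an arbitrary, assumed cut-off.
RATES = feature_binding_rates(ARRAY_DATA, threshold=500.0)
```

A feature whose first rate is much higher than its second is a candidate positive classifier for the protein; a real analysis would also need replicate signals and a statistical test.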
Without wishing to be bound by any particular theory, the inventors propose that a necessary condition for human adaptation of influenza A virus HA is gaining the ability to bind with high affinity to long α2-6 sialylated glycans (predominantly expressed in the human upper respiratory tract). For example, one aspect of glycan diversity is the length of the sialic acid-capped lactosamine branch. This is captured by two distinct features of α2-6 sialylated glycans obtained by data mining analysis (Table 7). One feature is characterized by Neu5Acα2-6Galβ1-4GlcNAc linked to the Man of the N-linked core, and the other is characterized by this motif being linked to another lactosamine unit, forming a longer branch (which typically adopts an umbrella topology). Therefore, extensive binding of mutant H5 HA to the upper respiratory tract is possible only if these mutants bind with high affinity to glycans with long α2-6 branches that adopt the umbrella topology. For example, according to the present invention, desirable binding patterns include binding to the umbrella glycans depicted in Figure 9.
By contrast, we note a recent report on modified H5 HA proteins (containing Gly228Ser and Gln226Leu/Gly228Ser substitutions), which showed binding only to a single biantennary α2-6 sialyl-lactosamine glycan structure on the glycan array (Stevens et al., Science 312:404, 2006). Therefore, as described herein, these modified H5 HA proteins are not BSHB H5 HA.
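The two branch-length features described above can be sketched as a simple string test on a single branch. The ASCII linear notation (a and b for α and β), the function name and the labels are assumptions made for illustration; branch length alone does not establish that a glycan actually adopts an umbrella topology.

```python
# Assumed ASCII notation for one branch, written from the non-reducing end.
CAP = "Neu5Aca2-6"          # a2-6 linked sialic acid cap
LACNAC = "Galb1-4GlcNAc"    # one lactosamine unit


def classify_a26_branch(branch):
    """Label a branch as 'none', 'short' (one lactosamine) or 'long' (two or more)."""
    if not branch.startswith(CAP + LACNAC):
        return "none"
    n_units = branch.count(LACNAC)
    return "long" if n_units >= 2 else "short"


SHORT_BRANCH = "Neu5Aca2-6Galb1-4GlcNAcb1-2Man"
LONG_BRANCH = "Neu5Aca2-6Galb1-4GlcNAcb1-3Galb1-4GlcNAcb1-2Man"
```

Here `classify_a26_branch(SHORT_BRANCH)` gives the first feature (motif attached directly to a core Man) and `classify_a26_branch(LONG_BRANCH)` gives the second (motif extended by another lactosamine unit).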
Binding of wild-type and mutant H5 HA to α2-6 sialylated glycans
Accordingly, the present invention illustrates that, based on the interactions of human H1 or H3 HA with these glycans, current avian H5N1 HA can undergo mutations that would switch its specificity to α2-6 glycans. The Glu190Asp, Lys193Ser, Gly225Asp and Gln226Leu mutations ("DSDL mutant") can potentially make the H5 HA binding site resemble the human H1 HA binding site, and the Glu190Asp, Lys193Ser, Gln226Leu and Gly228Ser mutations ("DSLS mutant") can potentially make it resemble the human H3 HA binding site, for optimal interaction with α2-6 sialylated glycans. DSDL and DSLS H5 HA mutants were designed according to the above framework and tested. Wild-type and mutant BSHB H5 HA were expressed in baculovirus and purified as reported earlier (Figure 10 XXXY).
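The mutant designs above can be written down compactly as substitutions applied to the wild-type H5 residues at positions 190, 193, 225, 226 and 228. The sketch below is only a bookkeeping aid under that assumption; the function names and data layout are hypothetical, and only the residues and substitutions themselves come from the text.

```python
import re

# Wild-type H5 HA residues at the binding-site positions discussed in the text.
WT_H5_SITE = {190: "Glu", 193: "Lys", 225: "Gly", 226: "Gln", 228: "Gly"}

MUTANTS = {
    "DSDL": ["Glu190Asp", "Lys193Ser", "Gly225Asp", "Gln226Leu"],  # H1-like
    "DSLS": ["Glu190Asp", "Lys193Ser", "Gln226Leu", "Gly228Ser"],  # H3-like
}

ONE_LETTER = {"Asp": "D", "Ser": "S", "Leu": "L"}
MUTATION_RE = re.compile(r"([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})")


def parse_mutation(text):
    """Split e.g. 'Glu190Asp' into ('Glu', 190, 'Asp')."""
    match = MUTATION_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"unrecognized mutation: {text}")
    return match.group(1), int(match.group(2)), match.group(3)


def apply_mutations(site, mutations):
    """Return a copy of the site with each substitution applied."""
    mutated = dict(site)
    for old, pos, new in map(parse_mutation, mutations):
        if mutated.get(pos) != old:
            raise ValueError(f"expected {old} at position {pos}")
        mutated[pos] = new
    return mutated


def mutant_label(mutations):
    """One-letter codes of the substituted residues, ordered by position."""
    parsed = sorted(map(parse_mutation, mutations), key=lambda m: m[1])
    return "".join(ONE_LETTER[new] for _, _, new in parsed)
```

Checking the wild-type residue before substituting guards against applying a mutation list to the wrong numbering scheme.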
It was found that only recombinant wild-type H5 HA bound extensively to the alveolar region, with little (if any) binding to trachea or bronchus, which is consistent with avian H5 HA binding to α2-3 sialylated glycans. By contrast, only the DSLS mutant (H3-like) bound to tracheal and bronchial tissue of the upper respiratory tract; moreover, this mutant did not bind to deep alveolar tissue.
For tissue binding experiments, tissue sections were deparaffinized, rehydrated, and incubated for 3 hours with wild-type and mutant HA proteins (diluted in PBS). Depending on the protein concentration of a given batch after purification, appropriate serial dilutions in the range of 1:10 to 1:100 were tested. After thorough washing with PBS, sections were blocked for 30 minutes with 2% BSA-PBS and then incubated for 3 hours with rabbit anti-avian H5N1 hemagglutinin antibody (Pro-Sci Inc.; 1:1000 in 2% BSA-PBS). Sections were washed with PBS and then incubated for 90 minutes with secondary goat anti-rabbit antibody (Invitrogen; 1:500 in 2% BSA-PBS). Sections were then counterstained with propidium iodide (red; Invitrogen; 1:200 in PBS) and observed under a confocal microscope (Zeiss LSM510 laser scanning confocal microscopy). All incubations were performed at room temperature.
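For the dilutions mentioned above, a small helper can tabulate volumes and resulting concentrations. This is a sketch under stated assumptions: a "1:N" dilution is taken to mean one part stock in N parts total, and the stock concentration, final volume and chosen factors are hypothetical values, not ones given in the protocol.

```python
def dilution_plan(stock_conc, factors, final_volume_ul):
    """Return volumes of stock and PBS, and the resulting concentration, per dilution."""
    plan = []
    for factor in factors:
        stock_ul = final_volume_ul / factor
        plan.append({
            "dilution": f"1:{factor}",
            "stock_ul": stock_ul,
            "pbs_ul": final_volume_ul - stock_ul,
            "conc": stock_conc / factor,
        })
    return plan


# Hypothetical example: 1.0 mg/mL stock, 200 uL per section, two ends of the range.
PLAN = dilution_plan(stock_conc=1.0, factors=[10, 100], final_volume_ul=200.0)
```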
Because the framework predicts that both the DSDL and DSLS mutants would bind extensively to α2-6 sialylated glycans in the upper respiratory tract, the observation that the DSLS form of H5 HA, but not the DSDL form, bound to tracheal and bronchial sections (and not to alveoli) is noteworthy. In these mutants, replacing Lys193 with Ser193 would remove the steric hindrance imposed by Lys193 (in wild-type H5 HA), thereby giving them significant specificity for α2-6 sialylated glycans. Moreover, because H5 and H1 belong to the same structural clade, H5 HA would be expected to mutate more readily to an H1-like glycan binding site.
To further understand why the DSDL mutant fails to bind α2-6 sialylated glycans, the mutations were mapped onto the Viet04_H5 crystal structure, which was further superimposed on the ASI30_H1_26 and APR34_H1_26 crystal structures. This mapping showed that all of the contacts with the α2-6 sialylated oligosaccharide are conserved between H1 HA and the DSDL mutant. However, Asp187, which is highly conserved in avian H5 HA, is in close proximity to Asp190 in the DSDL mutant. The presence of three aspartic acids (Asp187, Asp190 and Asp225) is further reflected in the pI of the DSDL mutant, which is 6.8 (compared with 7.3 for WT and the DSLS mutant). The interaction between Asp187 and Asp190 can potentially alter the conformation of Asp190, similar to the effect of Ser228 on Glu190 in H3 HA. The influence of the amino acid at position 187 on Asp190 is also evident in ASI30_H1, from the proximity of Thr187 and from the differences in its SASA between the interactions with α2-3 and α2-6 sialylated glycans. Because Asp190 is involved in making optimal contacts with α2-6 sialylated glycans in H1 HA, the influence of Asp187 on Asp190 can potentially disrupt this interaction. Mutating Gln226, which is highly conserved in H1 HA, to Leu in the DSDL mutant may affect the environment of the HA binding site of this mutant in the context of the other H1-like mutations, and make its binding to α2-6 sialylated glycans less than ideal.
The role of Gln226 in H1-like binding of H5 HA was further tested using the Glu190Asp/Lys193Ser mutant, or DS mutant, which retains Gln226. The lack of binding of the DS mutant to deep lung tissue is consistent with reduced binding to α2-3 sialylated glycans (owing to the Glu190Asp mutation). Similarly, the lack of binding of this mutant to upper respiratory tract tissue further supports the view that disruption of Asp190 by Asp187 reduces the binding of this mutant to α2-6 sialylated glycans. Therefore, mutations in current avian H5N1 HA would be biased towards generating an H3-like (as compared with H1-like) glycan binding site with broad specificity for α2-6 sialylated glycans.
The binding of the DSLS mutant to upper-lobe lung tissue raises the question of the diversity of α2-6 sialylated glycans in the upper respiratory tract. Lectin staining of human bronchial epithelial (HBE) cells clearly showed that these cells are rich in diverse α2-6 sialylated glycans, such as N-linked, O-linked and glycolipid glycans (Figure 18). The diversity of these α2-6 sialylated glycans was further corroborated by isolating N-linked glycans from the cell surface of HBE cells and characterizing them by MALDI-MS analysis.
Specifically, when 16HBE14o- cells (Dr. D.C. Gruenert, University of California, San Francisco) reached 90% confluence, approximately 70 × 10^6 cells were harvested using 100 mM citrate saline buffer, and the cell membranes were isolated after treatment with protease inhibitors (Calbiochem) and homogenization. The cell membrane fraction was treated with PNGaseF (New England Biolabs) and the reaction mixture was incubated overnight at 37 °C. The reaction mixture was boiled for 10 minutes to inactivate the enzyme, and the deglycosylated peptides and proteins were removed using a Sep-Pak C18 SPE cartridge (Waters). The glycans were further desalted and purified into neutral (25% acetonitrile) and acidic (50% acetonitrile containing 0.05% trifluoroacetic acid) fractions using graphitized carbon solid-phase extraction columns (Supelco). The acidic fraction (containing the sialylated glycans) was analyzed by MALDI-TOF MS in negative ion mode under soft ionization conditions (accelerating voltage 22 kV, grid voltage 93%, guide wire 0.3% and extraction delay time 150 ns). MALDI TOF-TOF fragmentation analysis of representative mass peaks illustrated the diversity in the branching pattern and the increased branch length of the N-linked glycans. The longer branch lengths and higher degree of branching observed in the glycan profile can affect the binding of H5 HA to these glycans.
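When assigning peaks from such negative-mode MALDI-MS spectra, candidate compositions are usually compared against calculated masses. The sketch below assumes free (underivatized) N-glycans detected as deprotonated ions and uses standard monoisotopic residue masses; the function names are hypothetical, and the ion type is an assumption because the text does not state which adducts were observed.

```python
# Monoisotopic masses (Da) of glycosidically linked residues, plus water and a proton.
RESIDUE_MASS = {
    "Hex": 162.0528,      # hexose, e.g. Man or Gal
    "HexNAc": 203.0794,   # N-acetylhexosamine, e.g. GlcNAc
    "NeuAc": 291.0954,    # N-acetylneuraminic acid
    "Fuc": 146.0579,      # fucose (deoxyhexose)
}
WATER = 18.0106
PROTON = 1.00728


def glycan_mono_mass(composition):
    """Monoisotopic mass of a free glycan from its residue composition."""
    return sum(RESIDUE_MASS[res] * n for res, n in composition.items()) + WATER


def mz_negative(composition, charge=1):
    """m/z of the deprotonated ion [M - zH]z- (assumed ion type)."""
    return (glycan_mono_mass(composition) - charge * PROTON) / charge


# Disialylated biantennary N-glycan: Hex5 HexNAc4 NeuAc2.
BIANTENNARY_S2 = {"Hex": 5, "HexNAc": 4, "NeuAc": 2}
```

Each additional lactosamine unit on a branch adds one Hex and one HexNAc residue (365.1322 Da), which is how increasing branch length shows up in the mass list.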
For example, one aspect of glycan diversity is the length of the sialic acid-capped lactosamine branch. This is captured by two distinct features of α2-6 sialylated glycans obtained by data mining analysis (Table 7). One feature is characterized by Neu5Acα2-6Galβ1-4GlcNAc linked to the Man of the N-linked core, and the other is characterized by this motif being linked to another lactosamine unit, forming a longer branch. Therefore, extensive binding of mutant H5 HA to the upper respiratory tract is possible only if these mutants have broad binding specificity for α2-6 sialylated glycans. For example, according to the present invention, desirable binding patterns include the patterns depicted in Figure 9 and/or:
Figure A200780030012D00341
and combinations thereof:
Figure A200780030012D00342
and/or
Figure A200780030012D00343
and combinations thereof.
By contrast, we note a recent report on modified H5 HA proteins (containing Gly228Ser and Gln226Leu/Gly228Ser substitutions), which showed binding only to a single biantennary α2-6 sialyl-lactosamine glycan structure on the glycan array (Stevens et al., Science 312:404, 2006). Therefore, as described herein, these modified H5 HA proteins are not BSHB H5 HA.
Equivalents
Those of ordinary skill in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific embodiments of the invention described herein. The scope of the present invention is not intended to be limited to the above description, but rather is as set forth in the appended claims.

Claims (12)

1. A method comprising the steps of:
determining features of glycan structures; and
correlating the binding of a glycan binding protein to a plurality of glycans with the determined features present in the glycans.
2. The method according to claim 1, wherein the correlating step comprises: comparing binding data for the glycan binding protein binding to a plurality of glycans containing the feature; and correlating the degree of binding with the presence or absence of the feature.
3. The method according to claim 1 or 2, wherein the feature is selected from the group consisting of: monosaccharide-level features, higher-order features, GBP binding features, and combinations thereof.
4. The method according to claim 3, wherein the monosaccharide-level feature is selected from the group consisting of: composition, explicit composition, and combinations thereof.
5. The method according to claim 3, wherein the monosaccharide-level feature comprises terminal composition.
6. The method according to claim 3, wherein the higher-order feature is selected from the group consisting of: pairs, triplets, quadruplets, clusters, average leaf depth, number of leaves, and combinations thereof.
7. The method according to claim 6, wherein the pairs are selected from the group consisting of: regular pairs, terminal pairs, and combinations thereof.
8. The method according to claim 6, wherein the triplets are selected from the group consisting of: regular, terminal, surface, and combinations thereof.
9. The method according to claim 6, wherein the quadruplets are selected from the group consisting of: regular, terminal, surface, and combinations thereof.
10. The method according to claim 3, wherein the GBP binding feature is selected from the group consisting of: mean signal per glycan, signal-to-noise ratio, and combinations thereof.
11. A method comprising the steps of:
determining features of glycan structures;
correlating the binding of a glycan binding protein to a plurality of glycans with the determined features present in the glycans; and
determining, based on the correlation, a set of glycans that are bound by the glycan binding protein.
12. The method according to claim 11, further comprising the step of preparing a glycan binding protein-specific glycan array comprising the determined set of glycans.

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US83786806P 2006-08-14 2006-08-14
US60/837,869 2006-08-14
US60/837,868 2006-08-14

Publications (1)

Publication Number Publication Date
CN101501696A true CN101501696A (en) 2009-08-05

Family

ID=40947453

Family Applications (2)

Application Number Title Priority Date Filing Date
CNA2007800300127A Pending CN101501696A (en) 2006-08-14 2007-08-14 Glycan data mining system
CNA2007800300729A Pending CN101553502A (en) 2006-08-14 2007-08-14 Hemagglutinin polypeptides, and reagents and methods relating thereto

Family Applications After (1)

Application Number Title Priority Date Filing Date
CNA2007800300729A Pending CN101553502A (en) 2006-08-14 2007-08-14 Hemagglutinin polypeptides, and reagents and methods relating thereto

Country Status (1)

Country Link
CN (2) CN101501696A (en)

Also Published As

Publication number Publication date
CN101553502A (en) 2009-10-07


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Open date: 20090805