WO2023141808A1 - Method and system for analyzing the interaction between G protein-coupled receptors and ligands - Google Patents

Method and system for analyzing the interaction between G protein-coupled receptors and ligands

Info

Publication number
WO2023141808A1
Authority
WO
WIPO (PCT)
Prior art keywords
receptor
interaction
visualization
ligand
analysis
Prior art date
Application number
PCT/CN2022/073979
Other languages
English (en)
French (fr)
Inventor
阮陈晨
刘吉星
Original Assignee
深圳阿尔法分子科技有限责任公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳阿尔法分子科技有限责任公司
Priority to PCT/CN2022/073979
Publication of WO2023141808A1

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00: ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B45/00: ICT specially adapted for bioinformatics-related data visualisation, e.g. displaying of maps or networks
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B5/00: ICT specially adapted for modelling or simulations in systems biology, e.g. gene-regulatory networks, protein interaction networks or metabolic networks
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B50/00: ICT programming tools or database systems specially adapted for bioinformatics
    • G16B50/30: Data warehousing; Computing architectures

Definitions

  • the present invention relates to the technical field of bioinformatics, more specifically, to a method and system for analyzing the interaction between G protein-coupled receptors and ligands.
  • GPCRs G protein-coupled receptors
  • These receptors share a common feature: they are located in the cell membrane, and their structure consists of seven transmembrane helices connected by extracellular and intracellular loops.
  • GPCRs constitute the largest family of drug targets.
  • Many publications show that these receptors play an important role in the discovery of new drugs. Their structures can be elucidated by crystallization, which makes it possible to understand the binding mode between a receptor and its ligand (agonist or antagonist); the binding is usually non-covalent.
  • the object of the present invention is to overcome the above-mentioned drawbacks of the prior art by providing methods and systems for analyzing the interaction of G protein-coupled receptors with (antagonist or agonist) ligands.
  • a method for analyzing the interaction between a G protein-coupled receptor and a ligand includes the following steps:
  • a system for analyzing the interaction between a G protein-coupled receptor and a ligand includes:
  • Search module: used to receive the receptor name or crystal structure number input by the user;
  • Data preprocessing and storage module: used to collect receptor-related crystal structure information and store it in the corresponding database;
  • Data processing and analysis module: used to clean up PDB files and perform computational analysis based on geometric and physicochemical properties to obtain analysis results reflecting the interaction between receptors and ligands;
  • Result visualization module: used to display the relevant data of the analysis results as 2D and 3D visualizations.
  • the present invention has the advantage that the interaction between the receptor and the ligand can be analyzed and visualized with almost no user intervention, just inputting the name or structure number of the receptor.
  • the platform holds precomputed results for each structure, and the results are presented as 2D and 3D visualizations. Users can therefore obtain results quickly, which makes research in fields such as structural biology, pharmacophore research and virtual screening deeper and more convenient.
  • the invention can be used to study the changes of the receptor pocket structure, the pharmacophore of the ligand, residues with important interaction characteristics and other functions.
  • Fig. 1 is a schematic diagram of the process of data analysis and result visualization according to an embodiment of the present invention
  • Fig. 2 is a schematic diagram of a platform home page (search module) according to an embodiment of the present invention
  • Figure 3 is a result layout for a specific receptor according to one embodiment of the present invention.
  • Fig. 4 is a visual representation of a 3D structure according to an embodiment of the present invention.
  • FIG. 5 is a schematic diagram of a radar chart according to an embodiment of the present invention.
  • Fig. 6 shows the chemical structures of ergotamine, lysergide, methylergonovine, lisuride and Ly 266097 hydrochloride according to one embodiment of the present invention;
  • Fig. 7 is a 2D and 3D visualization diagram of the interaction between a receptor and a ligand according to an embodiment of the present invention
  • Labels used in the figures: Results Page (results page); Globle (overall); Specific structure (specific structure); PDB structures (PDB structures); visualization (visualization); Snake plot (snake plot); Radar chart (radar chart); 2D Diagram (two-dimensional diagram).
  • the present invention provides a platform or system for analyzing the interaction between G protein-coupled receptors and ligands.
  • the platform is developed in the form of a database, which can supplement existing data and ensure users get more accurate and complete results, thereby providing higher research value.
  • the platform combines computational and biochemical methods for ease of use. Users only need to look up the receptor name or PDB (crystal structure) number to get 2D and 3D visualization. Interaction analysis has been performed on the platform in advance, avoiding unnecessary waiting for calculations, therefore, users can quickly access the result display.
  • the present invention provides displays of the receptor population as well as of individual crystal structures; a receptor population consists of structures that share the same receptor, the same host species and the same ligand type.
  • Data analysis and result generation can be generally divided into two parts: i), data mining and storage; ii), interaction fingerprint analysis.
  • Data mining and storage processes the data, whereas interaction fingerprint analysis produces the results and visualizes them in different forms.
  • data mining and storage includes the following steps.
  • Step i.a) retrieve data from the structures.
  • This step screens for structures that meet the selection criteria.
  • The criteria include, for example, the presence of a small ligand, a receptor in the active or intermediate state, and a low resolution value (the threshold depends on the crystallization method used); the result is the list of crystal structure numbers.
  • In one embodiment, this step retrieves the relevant class A GPCR structure data only through the REST API of the GPCRdb server.
  • Each structure is described by several criteria such as crystallization method, resolution of the crystal structure, refinement of the structure and composition of the molecules in the structure. Among these criteria the resolution is particularly important, and a threshold is set on it to exclude low-precision structures. In this way, the platform contains only class A GPCR structures in an active or intermediate state, with a low resolution value (a numerically small resolution in ångströms corresponds to a more precise structure) and with a small ligand present.
  • Step i.b) create a "receptor structure" database containing receptor name, structure sequence, receptor chain, host species, structural state, resolution, publication, ligand name and ligand type.
  • In this step, data about the crystal structures, such as receptor name, chain number, resolution, host species, references and ligand name, are retrieved from the data obtained in step i.a). All collected crystal structure information is completed and saved to the "receptor structure" database of the platform.
  • Step i.c) create a "structure group" database containing the receptor name, all structure numbers of the same receptor, the entire sequence of the receptor, the shared host species and the shared ligand type.
  • The classification uses three criteria: same receptor, same host species and same ligand type. Most structures share all three with other structures and so form a cluster. This grouping is called triple-identity aggregation.
  • The three identities are therefore: the same receptor, the same host species and the same ligand type.
  • A set of structures sharing all three characteristics is called a "triple-identity" cluster.
  • The different ligand types of the triple-identity clusters are displayed on the overall receptor results page.
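The triple-identity grouping described above can be sketched as a simple keyed grouping. This is a minimal illustration, not the platform's implementation: the `Structure` record, the function name and all example values (`AAA1`, `receptor_A`, and so on) are hypothetical placeholders.

```python
from collections import defaultdict
from typing import NamedTuple


class Structure(NamedTuple):
    """Hypothetical record; field names are assumptions for illustration."""
    pdb_code: str
    receptor: str
    host_species: str
    ligand_type: str  # for example "agonist" or "antagonist"


def group_by_triple_identity(structures):
    """Map (receptor, host species, ligand type) to a sorted list of PDB codes."""
    groups = defaultdict(list)
    for s in structures:
        groups[(s.receptor, s.host_species, s.ligand_type)].append(s.pdb_code)
    return {key: sorted(codes) for key, codes in groups.items()}


# Placeholder data only; these are not real structures or annotations.
structures = [
    Structure("AAA2", "receptor_A", "Homo sapiens", "agonist"),
    Structure("AAA1", "receptor_A", "Homo sapiens", "agonist"),
    Structure("AAA3", "receptor_A", "Homo sapiens", "antagonist"),
    Structure("BBB1", "receptor_B", "Homo sapiens", "agonist"),
]
clusters = group_by_triple_identity(structures)
```

Here the two `receptor_A` clusters differ only in ligand type, which is the case in which both would appear on the same overall receptor page.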
  • Step i.d) create a "ligand" database containing ligand name, InChI name, IUPAC name, ligand chain, synonyms, molecular formula, number in the structure, ligand type, SMILES, molecular weight, logP, TPSA, the numbers of hydrogen-bond donors and acceptors, and the number of rotatable bonds.
  • Identifying the ligand requires the IUPHAR/BPS Guide to PHARMACOLOGY and the PubChem chemical molecule database, which are used to obtain all names of the ligand; the physicochemical and geometric properties of the ligand are then calculated. All ligand information for the collected crystal structures is completed and saved to the "ligand" database of the platform.
  • The structure is also cleaned.
  • Identifiers such as the IUPAC name and InChI name are looked up.
  • The physicochemical properties of the ligands, such as molecular weight, logP, TPSA, the numbers of hydrogen-bond donors and acceptors, and the number of rotatable bonds, are calculated mainly with RDKit.
  • interaction fingerprint analysis includes the following steps.
  • Step ii.a) generating a PDB file suitable for further analysis.
  • The receptor, defined by its chain, is extracted from the PDB file; the ligand, defined by its three-letter code and chain, is extracted; and the water molecules on the same chain as the receptor are extracted. These two (without water molecules) or three types of molecules are assembled into a new file called "complex.pdb".
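The extraction in step ii.a) can be sketched with plain fixed-column parsing of PDB records. This is a minimal sketch under stated assumptions: the function names, the ligand code `LIG` and the example atoms are hypothetical, and modified residues stored as HETATM records inside the receptor chain are ignored for brevity.

```python
def pdb_line(record, serial, atom, resname, chain, resseq, x, y, z):
    """Build a fixed-column ATOM/HETATM line (helper for the example only)."""
    return (f"{record:<6}{serial:>5} {atom:<4} {resname:>3} {chain}"
            f"{resseq:>4}    {x:8.3f}{y:8.3f}{z:8.3f}")


def extract_complex(pdb_lines, receptor_chain, ligand_resname, ligand_chain,
                    keep_water=True):
    """Keep the receptor chain, the ligand and (optionally) the waters on the
    receptor chain; everything else is dropped."""
    kept = []
    for line in pdb_lines:
        record = line[0:6].strip()
        if record not in ("ATOM", "HETATM"):
            continue
        resname = line[17:20].strip()   # PDB columns 18-20
        chain = line[21:22]             # PDB column 22
        is_receptor = record == "ATOM" and chain == receptor_chain
        is_ligand = (record == "HETATM" and resname == ligand_resname
                     and chain == ligand_chain)
        is_water = (keep_water and record == "HETATM" and resname == "HOH"
                    and chain == receptor_chain)
        if is_receptor or is_ligand or is_water:
            kept.append(line)
    kept.append("END")
    return kept


# Placeholder atoms; "LIG" stands in for the ligand's three-letter code.
example = [
    pdb_line("ATOM", 1, "CA", "VAL", "A", 136, 1.0, 2.0, 3.0),
    pdb_line("ATOM", 2, "CA", "GLY", "B", 10, 4.0, 5.0, 6.0),
    pdb_line("HETATM", 3, "C1", "LIG", "A", 1201, 7.0, 8.0, 9.0),
    pdb_line("HETATM", 4, "O", "HOH", "A", 1301, 1.5, 2.5, 3.5),
    pdb_line("HETATM", 5, "O", "HOH", "B", 1302, 4.5, 5.5, 6.5),
]
complex_lines = extract_complex(example, "A", "LIG", "A")
complex_no_water = extract_complex(example, "A", "LIG", "A", keep_water=False)
```

Writing `complex_lines` to a file named "complex.pdb" would give the two-molecule or three-molecule assembly described in the text.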
  • Step ii.b) optimizing the receptor structure in complex.pdb.
  • the optimization consists of only one step: the filling of missing side-chain residues using Schrödinger's computational tools.
  • Step ii.c) calculate the non-covalent interactions between the molecules present in the PDB file; the analysis is performed on the "complex.pdb" file.
  • The types of interactions include hydrophobic interactions, hydrogen bonds, water bridges, salt bridges, π-π stacking, π-cation interactions, halogen bonds and metal complexes.
  • The database of interactions created for each complex contains the name of the complex; the name, number, atom name and atom number of each interacting "residue"; the x, y and z position of the atom (here "residue" means a residue of the receptor, the ligand or a water molecule); the distance; and a second distance, distance2 (only in the case of a water bridge).
  • the analysis results are presented in xml format.
  • Step ii.d) in the analysis results, keep only the water molecules that participate in water bridges.
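Step ii.d) amounts to filtering the water records of the PDB file by the water residue numbers found in the interaction results. The sketch below assumes the interactions have already been read into simple dictionaries; the keys `type` and `water_resnr` are hypothetical and do not reproduce the layout of the actual XML result file.

```python
def water_line(serial, chain, resseq):
    """Build a fixed-column HETATM line for a water oxygen (example helper)."""
    return (f"HETATM{serial:>5}  O   HOH {chain}{resseq:>4}    "
            f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}")


def waters_in_bridges(interactions):
    """Collect the residue numbers of waters that take part in a water bridge."""
    return {i["water_resnr"] for i in interactions if i["type"] == "water_bridge"}


def filter_waters(pdb_lines, keep_resnrs):
    """Drop every water record whose residue number is not in keep_resnrs."""
    kept = []
    for line in pdb_lines:
        is_water = line.startswith("HETATM") and line[17:20] == "HOH"
        if is_water and int(line[22:26]) not in keep_resnrs:
            continue
        kept.append(line)
    return kept


# Placeholder interaction records, not real analysis output.
interactions = [
    {"type": "hydrophobic", "residue": "VAL136"},
    {"type": "water_bridge", "residue": "RES100", "water_resnr": 1301},
]
lines = [water_line(1, "A", 1301), water_line(2, "A", 1302)]
kept_lines = filter_waters(lines, waters_in_bridges(interactions))
```

Only water 1301 survives, because it is the only one referenced by a water-bridge record.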
  • Step ii.e) modify the PDB file prepared in step ii.c) by explicitly adding the helix data, so that the 3D visualization can be rendered with the Iview technique.
  • This technique requires secondary structure data to be able to identify the receptor as a protein; the newly generated file is named, for example, "complex_iview.pdb".
  • Step ii.f) modify the PDB file prepared in step ii.c) to keep only the receptor structure, and gather the structures of a triple-identity cluster for alignment.
  • Aligned structures can be displayed on the Receptor General page.
  • the aligned structure is saved as "aligned_structure.pdb", for example.
  • Step ii.g) generate the results for the overall receptor page.
  • This step performs an interaction analysis on the structures of one triple-identity cluster (receptor population), designing and visualizing the interactions of all complexes in the form of a radar chart.
  • The radar chart generated in step ii.g) depends on the results obtained in step ii.c).
  • The radar chart is computed from the results for the structures of a triple-identity cluster. Results for structures with agonist-type ligands are shown, for example, in red, while those for structures with antagonist-type ligands are shown, for example, in green. If two clusters share the same receptor and the same host species but differ in ligand type, both are displayed on the same overall receptor page.
  • Step ii.h) perform interaction analysis on specific (single) structures, including designing and visualizing the interaction results of the complex as a snake plot and as a two-dimensional diagram.
  • the result of analyzing a single structural page is generated, including the following sub-steps:
  • Step ii.h.1) store the snake plot of the receptor and color the residues that interact with the ligand (according to step ii.c) in red (see Fig. 7(b), top);
  • Step ii.h.2) calculate and plot the interactions between receptor residues and the ligand using the Poseview program (see Fig. 7(b), bottom).
  • Figure 7 is a 2D and 3D visualization of the interaction between the 5-HT2B receptor and ergotamine (PDB code: 4IB4).
  • Bottom: 2D interaction map showing the residues that interact with the ligand structure. Green lines and residues labeled in green indicate hydrophobic interactions, while black dotted lines indicate hydrogen bonds.
  • the images visible on the platform can be downloaded in PNG or ZIP format from the results page, and the calculation result files including the receptor pocket and ligand complexes can also be obtained. See the visualization results shown in Table 1 below.
  • The platform can be implemented in various ways, for example with the Django framework together with the web technologies HTML, CSS and JavaScript.
  • HTML is used to design the web pages.
  • JavaScript is used for the 3D visualization and display of the analysis results.
  • using the platform includes the following steps:
  • Step S1) enter a receptor name or structure number.
  • The platform does not cover all rhodopsin-like receptors, nor does it hold all existing rhodopsin-like receptor structures. An entry is accepted only if it exists on the platform.
  • Figure 3 is the result layout for a specific receptor.
  • On the left are a 3D visualization giving an overall view of all structures of the receptor and a radar chart of the ligand-receptor interactions.
  • The radar chart shows the type and sum of all interactions in the complexes of that receptor.
  • The structures are shown in cartoon representation, each in a unique color.
  • The 3D visualization is divided into two windows: the structures with agonist ligands (ligands not shown) are in the left window and the structures with antagonist ligands (ligands not shown) are in the right window.
  • On the right of Fig. 3 are the results for individual structures (identified by their PDB numbers): the 3D interaction visualization, the snake plot and the 2D diagram.
  • The axes of the radar chart correspond to the receptor residues (with their positions) that interact with the ligand, and the scale represents how many ligands in the complexes interact with a particular residue.
  • The radar chart is colored according to the type of ligand in the structure, with receptor structures interacting with agonist ligands shown in red and receptor structures interacting with antagonist ligands shown in green.
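The scale of the radar chart, that is, how many complexes of a given ligand type contact each residue, can be computed as a per-residue count. The following is a minimal sketch; the record layout, the function name and the example interactions are hypothetical and are not results from the patent.

```python
from collections import Counter


def residue_frequencies(complexes, ligand_type, interaction_type=None):
    """Count, per residue, the number of complexes of one ligand type in which
    the residue contacts the ligand (optionally for one interaction type)."""
    counts = Counter()
    for c in complexes:
        if c["ligand_type"] != ligand_type:
            continue
        # A set ensures each complex contributes at most once per residue.
        residues = {
            residue for residue, itype in c["interactions"]
            if interaction_type is None or itype == interaction_type
        }
        counts.update(residues)
    return counts


# Placeholder complexes and contacts, for illustration only.
complexes = [
    {"pdb": "AAA1", "ligand_type": "agonist",
     "interactions": [("VAL136", "hydrophobic"), ("PHE340", "hydrophobic"),
                      ("VAL136", "hydrophobic")]},
    {"pdb": "AAA2", "ligand_type": "agonist",
     "interactions": [("VAL136", "hydrophobic"), ("LEU209", "hydrogen_bond")]},
    {"pdb": "BBB1", "ligand_type": "antagonist",
     "interactions": [("LEU209", "hydrophobic")]},
]
agonist_all = residue_frequencies(complexes, "agonist")
agonist_hbond = residue_frequencies(complexes, "agonist", "hydrogen_bond")
antagonist_all = residue_frequencies(complexes, "antagonist")
```

Each `Counter` corresponds to one radar trace: its keys are the axes (residues) and its values are the radial scale, with one trace per ligand type and, if desired, one chart per interaction type.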
  • the snake plot is obtained through the GPCRdb server and customized to show interacting residues in red. Only residues in the helix are shown, residues located in the loop are shown by lines.
  • The two-dimensional diagram is calculated by the Poseview program, which displays the ligand-residue interactions and complements the other views. This result is shown unmodified.
  • Step S2) retrieve the data obtained in part ii), including the structure files, the interaction calculation files and the generated 2D visualization result files.
  • Steps S3 and S4 correspond to the overall receptor results:
  • The structure obtained in step ii.d.2) is visualized in 3D without showing the ligand.
  • the structure used is the "aligned_structure.pdb" file.
  • the radar plots of these two sets of results are displayed side by side to facilitate interaction fingerprinting and comparison.
  • Steps S5 and S6 correspond to results of specific structures, including:
  • the results are shown with 3D visualization of receptor-ligand interactions by NGL technology.
  • the technology analyzes XML documents to reveal interactions;
  • See Figure 4 for a visualization of the 3D structures, rendered in cartoon form, each in a unique color. Structures with agonist ligands (ligands not shown) are in the left window and structures with antagonist ligands (ligands not shown) are in the right window.
  • The structures are aligned and the transmembrane helices overlap well, though slightly offset.
  • The calculated RMSD is low, less than 3.5 ångströms between the structures of the triple-identity cluster.
  • Structural alignment takes the alignment of residues into account, and proteins are flexible, so the RMSD between structures crystallized under the same conditions may not fall below 1 ångström.
  • The loops usually differ more or less in conformation; for example, the 6DRZ structure has a helix at its end that is not seen in the other structures.
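For reference, the RMSD between two already superimposed structures is the square root of the mean squared distance between matched atoms. The sketch below assumes the coordinates have been aligned and paired beforehand (for example one Cα atom per aligned residue); the function name and the coordinates are illustrative only, and no superposition step is performed.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of
    (x, y, z) tuples that are already superimposed and paired."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    if not coords_a:
        raise ValueError("coordinate lists must not be empty")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Illustrative coordinates: the second set is the first shifted by 1 along z.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
shifted = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0)]
value = rmsd(reference, shifted)
below_threshold = value < 3.5  # 3.5 is the bound quoted in the text
```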
  • Figure 5 shows radar charts for the overall sum (all), hydrophobic interactions, hydrogen bonds, water bridges, salt bridges, π-π stacking (pi-stacks), π-cation interactions, halogen bonds and metal complexes. These interactions were computed with the PLIP tool. The charts for agonist ligands (red) and antagonist ligands (green) are placed next to each other. The radar charts show how frequently each residue interacts across the complexes. Valine 136 and phenylalanine 340 are the residues that interact with both ligand types at high frequency.
  • Some residues appear more often with one ligand type: leucine 132, phenylalanine 341, asparagine 359 and valine 366 are abundant in the interactions with agonist ligands, while alanine 255 is abundant in the interactions with antagonist ligands.
  • Leucine 209 forms hydrogen bonds with agonist ligands, whereas it forms hydrophobic interactions with antagonist ligands.
  • Other interaction types also exist, but in smaller numbers.
  • The binding site is mainly hydrophobic, with some hydrophilic contacts that increase the stability of the ligand in the receptor pocket. Regardless of ligand type, the ligands have a similar structure: a ligand of this receptor should have an ergot ring, and the attached moiety can be varied to develop ligands with different activities and functions.
  • the system or platform provided by the present invention can be implemented by software or hardware, and is used to realize one or more aspects of the above solutions.
  • The overall system includes a search module, a data preprocessing and storage module (for example performing steps i.a to i.d), a processing module, a data analysis module and a result visualization module.
  • Search module (user part): used to enter the name or structure number of a receptor that exists in the platform.
  • Data preprocessing and storage module: used to collect eligible crystal structures, such as receptor structures that contain small ligands, have a low resolution value and are in an active or intermediate state, and to import the data about the collected structures, ligands and structure groups into the platform database.
  • Processing module: used to clean up the PDB file so that it contains only the receptor, the ligand and the water molecules participating in water bridges. In addition, the receptor undergoes a single structural optimization step, side-chain filling. New PDB files are then created, such as "complex_iview.pdb" for display with the Iview technique and "aligned_structure.pdb" for displaying structural similarity.
  • Data analysis module: used for the calculation and analysis of geometric and physicochemical properties; the results can be stored as XML files.
  • Result visualization module: displays the results of the data analysis module as 2D and 3D visualizations.
  • 2D visualization includes: designing and visualizing receptor-ligand interactions as a radar chart covering hydrophobic interactions, hydrogen bonds, water bridges, salt bridges, π-π stacking, π-cation interactions, halogen bonds, metal complexes and the combination of all interactions; designing and visualizing interaction results as snake plots; and designing and visualizing interaction results as two-dimensional interaction diagrams.
  • 3D visualization including: implementing NGL technology, reading the xml file obtained in step ii.c) and the PDB file obtained in step ii.d) for result analysis and visual display.
  • Implement the Iview technique to read the PDB file obtained in step ii.e) for visualization of stereogram structures (optional).
  • the cleaned PDB file only contains the position points of atoms, so there are differences between the PDB files used by Iview technology and NGL technology.
  • Iview technology needs to add the location of the helix.
  • the present invention uses NGL technology as the main 3D visualization. Users can display 3D visualizations of Iview technology through a single structure page.
  • the present invention has at least the following advantages:
  • the radar map presents the interaction according to the ligand type, showing as complete an analysis as possible, and customizes the serpentine map to display the interacting residues.
  • the present invention can be a system, method and/or computer program product.
  • a computer program product may include a computer-readable storage medium carrying computer-readable program instructions for causing a processor to implement various aspects of the invention.
  • a computer readable storage medium may be a tangible device that can retain and store instructions for use by an instruction execution device.
  • a computer readable storage medium may be, for example, but is not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing.
  • Computer-readable storage media include: portable computer diskettes, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), compact disc read-only memory (CD-ROM), digital versatile discs (DVD), memory sticks, floppy disks, mechanically encoded devices such as punch cards or raised structures in a groove with instructions stored thereon, and any suitable combination of the above.
  • computer-readable storage media are not to be construed as transient signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through waveguides or other transmission media (e.g., pulses of light through fiber optic cables), or transmitted electrical signals.
  • Computer-readable program instructions described herein may be downloaded from a computer-readable storage medium to a respective computing/processing device, or downloaded to an external computer or external storage device over a network, such as the Internet, a local area network, a wide area network, and/or a wireless network.
  • the network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers.
  • A network adapter card or a network interface in each computing/processing device receives computer-readable program instructions from the network and forwards them for storage in a computer-readable storage medium in that computing/processing device.
  • Computer program instructions for carrying out operations of the present invention may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk, C++ and Python, and conventional procedural programming languages such as the "C" language or similar programming languages.
  • Computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server.
  • The remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computer (for example, via the Internet using an Internet service provider).
  • an electronic circuit such as a programmable logic circuit, field programmable gate array (FPGA), or programmable logic array (PLA)
  • These computer-readable program instructions may be provided to a processor of a general purpose computer, special purpose computer or other programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, produce an apparatus for realizing the functions/actions specified in one or more blocks of the flowchart and/or block diagram.
  • These computer-readable program instructions can also be stored in a computer-readable storage medium; the instructions cause computers, programmable data processing devices and/or other devices to work in a specific way, so that the computer-readable medium storing the instructions comprises an article of manufacture that includes instructions implementing various aspects of the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams.
  • Each block in a flowchart or block diagram may represent a module, a program segment or a portion of an instruction that contains one or more executable instructions.
  • The functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks in succession may, in fact, be executed substantially concurrently, or they may sometimes be executed in the reverse order, depending upon the functionality involved.
  • Each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by a dedicated hardware-based system that performs the specified function or action, or by a combination of dedicated hardware and computer instructions. It is well known to those skilled in the art that implementation by hardware, implementation by software, and implementation by a combination of software and hardware are all equivalent.

Landscapes

  • Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Theoretical Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biophysics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Biotechnology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioethics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Public Health (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Epidemiology (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Physiology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

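The present invention discloses a method and system for analyzing the interaction between G protein-coupled receptors and ligands. The method comprises: receiving a receptor name or crystal structure number entered by a user; collecting crystal structure information related to the receptor and storing it in a corresponding database; cleaning the PDB files and performing computational analysis based on geometric and physicochemical properties to obtain analysis results that reflect the receptor-ligand interaction; and displaying the relevant data of the analysis results as 2D and 3D visualizations. The invention can provide analyses of multiple interaction types, a 3D visualization of the overall structure based on the multiple structures of a receptor, and 2D visualizations, thereby facilitating in-depth research in fields such as structural biology, pharmacophore research and virtual screening.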

Description

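Method and system for analyzing the interaction between G protein-coupled receptors and ligands
Technical Field
The present invention relates to the technical field of bioinformatics and, more specifically, to a method and system for analyzing the interaction between G protein-coupled receptors and ligands.
Background
In recent years, receptors of the GPCR (G protein-coupled receptor) family have been widely studied for therapeutic applications. These receptors share a common feature: they are all located in the cell membrane, and their structure consists of seven transmembrane helices connected by extracellular and intracellular loops. Owing to their diverse pharmacological properties, GPCRs constitute the largest family of drug targets. Many publications show that these receptors play an important role in the discovery of new drugs. Moreover, their structures can be elucidated by crystallization, which makes it possible to understand the binding mode between a receptor and its ligand (agonist or antagonist); the binding is usually non-covalent. Grouping together ligand-containing structures of the same receptor gives a better understanding of the interaction between the two molecules. These interactions reveal the physicochemical and geometric properties of the ligand, the activation mode of the receptor and routes for optimizing lead molecules. In addition, molecular docking and virtual screening that take these interactions into account should, in theory, improve the discovery of potential hit molecules.
However, the prior art lacks an effective scheme for analyzing receptor-ligand interactions. Because there are no computational tools that collect crystal structures and provide the binding modes of the complexes, reference data are missing and many attempts fail. Researchers mostly rely on their own laboratory work to study the contacts within complexes; this applies in particular to biomedical researchers working on a specific target or drug, and even to the scientists who elucidate the crystal structures. Furthermore, there is currently no scheme for directly visualizing the interaction modes of complexes.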
发明内容
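The object of the present invention is to overcome the above drawbacks of the prior art by providing a method and system for analyzing the interaction between G protein-coupled receptors and (antagonist or agonist) ligands.
According to a first aspect of the present invention, a method for analyzing the interaction between a G protein-coupled receptor and a ligand is provided. The method comprises the following steps:
receiving a receptor name or crystal structure number entered by a user;
collecting crystal structure information related to the receptor and storing it in a corresponding database;
cleaning the PDB files and performing computational analysis based on geometric and physicochemical properties to obtain analysis results reflecting the interaction between the receptor and the ligand;
displaying the relevant data of the analysis results as 2D and 3D visualizations.
According to a second aspect of the present invention, a system for analyzing the interaction between a G protein-coupled receptor and a ligand is provided. The system comprises:
a search module: used to receive a receptor name or crystal structure number entered by a user;
a data preprocessing and storage module: used to collect crystal structure information related to the receptor and store it in a corresponding database;
a data processing and analysis module: used to clean the PDB files and perform computational analysis based on geometric and physicochemical properties to obtain analysis results reflecting the interaction between the receptor and the ligand;
a result visualization module: used to display the relevant data of the analysis results as 2D and 3D visualizations.
Compared with the prior art, the advantage of the present invention is that the interaction between a receptor and a ligand can be analyzed and visualized with almost no user intervention; the user only needs to enter the name or structure number of the receptor. The platform holds precomputed results for each structure, and the results are presented as 2D and 3D visualizations. Users can therefore obtain results quickly, which makes research in fields such as structural biology, pharmacophore research and virtual screening deeper and more convenient. The invention can be used to study changes in the receptor pocket structure, the pharmacophore of the ligand, residues with important interaction characteristics, and similar questions.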
通过以下参照附图对本发明的示例性实施例的详细描述,本发明的其它特征及其优点将会变得清楚。
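Brief Description of the Drawings
The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate embodiments of the invention and, together with the description, serve to explain its principles.
Fig. 1 is a schematic diagram of the data analysis and result visualization process according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of the platform home page (search module) according to an embodiment of the present invention;
Fig. 3 shows the result layout for a specific receptor according to an embodiment of the present invention;
Fig. 4 is a schematic visualization of 3D structures according to an embodiment of the present invention;
Fig. 5 is a schematic diagram of the radar charts according to an embodiment of the present invention;
Fig. 6 shows the chemical structures of ergotamine, lysergide, methylergonovine, lisuride and Ly 266097 hydrochloride according to an embodiment of the present invention;
Fig. 7 is a schematic 2D and 3D visualization of receptor-ligand interactions according to an embodiment of the present invention;
The drawings carry the following English labels: Results Page; Globle (overall view of the receptor); Specific structure (individual structure); PDB structures; visualization; Snake plot; Radar chart; 2D Diagram (two-dimensional interaction diagram).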
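Detailed Description
Various exemplary embodiments of the present invention will now be described in detail with reference to the accompanying drawings. It should be noted that, unless specifically stated otherwise, the relative arrangement of components and steps, the numerical expressions and the numerical values set forth in these embodiments do not limit the scope of the invention.
The following description of at least one exemplary embodiment is merely illustrative and is in no way intended to limit the invention, its application or its uses.
Techniques, methods and devices known to those of ordinary skill in the relevant art may not be discussed in detail but, where appropriate, should be regarded as part of the specification.
In all examples shown and discussed here, any specific value should be interpreted as merely illustrative and not as a limitation. Other examples of the exemplary embodiments may therefore have different values.
It should be noted that similar reference numerals and letters denote similar items in the following drawings, so once an item has been defined in one drawing it need not be discussed further in subsequent drawings.
The present invention provides a platform, also called a system, for analyzing interactions between G protein-coupled receptors and ligands. The platform is developed in the form of a database and can supplement existing data, so that users obtain more accurate and more complete results of higher research value. The platform combines computational and biochemical methods and is easy to use. The user only needs to look up a receptor name or a PDB (crystal structure) identifier to obtain the 2D and 3D visualizations. The interaction analysis is carried out in advance, which avoids unnecessary waiting for computation, so users can access the results quickly. In addition, the invention provides displays both for a receptor as a whole and for individual crystal structures, where the receptor as a whole means the set of structures sharing the same receptor, the same host species and the same ligand type.
In the following, the two processes of data analysis and platform use are described in detail with reference to Fig. 1.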
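I. Data analysis and result generation
Data analysis and result generation can be divided into two parts: i) data mining and storage; ii) interaction fingerprint analysis. Data mining and storage processes the data, while interaction fingerprint analysis produces the results, which are visualized in different forms.
1) Data mining and storage
Specifically, data mining and storage comprises the following steps.
Step i.a): retrieve data from the structures.
This step screens for structures that meet the selection criteria and yields their crystal structure identifiers. For example, the criteria include: presence of a small-molecule ligand, a receptor in the active or intermediate state, and a low resolution value, i.e. a well-resolved structure (the threshold depends on the crystallization method used).
In one embodiment, this step retrieves data on class A GPCR structures solely through the REST API of the GPCRdb server. Each structure is described by several criteria, such as the crystallization method, the resolution of the crystal structure, the refinement of the structure and the molecular composition of the structure. Among these, the resolution is the most important, and a threshold is set on it to exclude structures of low accuracy. In this way, the structures in the platform are limited to class A GPCRs in the active or intermediate state, with a resolution value below the threshold and a small-molecule ligand present.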
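The selection criteria of step i.a) can be illustrated with a minimal sketch. It assumes the structure records have already been downloaded and parsed into dictionaries; the field names (`pdb_code`, `class`, `state`, `resolution`, `ligand_type`) and their values are hypothetical and are not taken from the GPCRdb API, and the cutoff is left to the caller because the text does not state one.

```python
def select_structures(records, max_resolution):
    """Return the identifiers of structures that pass the step i.a) criteria.

    `records` is a list of dicts with hypothetical keys; a smaller
    resolution value (in angstroms) means a better-resolved structure.
    """
    allowed_states = {"active", "intermediate"}
    selected = []
    for rec in records:
        # Keep class A receptors only.
        if rec.get("class") != "Class A":
            continue
        # Keep receptors in the active or intermediate state only.
        if rec.get("state", "").lower() not in allowed_states:
            continue
        # Drop structures with missing or too coarse resolution.
        resolution = rec.get("resolution")
        if resolution is None or resolution > max_resolution:
            continue
        # Keep structures bound to a small-molecule ligand only.
        if rec.get("ligand_type") != "small-molecule":
            continue
        selected.append(rec["pdb_code"])
    return selected
```

Passing the cutoff as an argument keeps the sketch neutral about the threshold, which the description says depends on the method used to solve the structure.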
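Step i.b): create a "receptor structure" database containing the receptor name, structure sequence, receptor chain, host species, structural state, resolution, publication, ligand name and ligand type.
In this step, data on each crystal structure, such as the receptor name, chain identifier, resolution, host species, reference and ligand name, are retrieved from the data obtained in step i.a). The information for all collected crystal structures is filled in and saved to the "receptor structure" database of the platform.
Step i.c): create a "structure grouping" database containing the receptor name, the identifiers of all structures of the same receptor, the full sequence of the receptor, the shared host species and the shared ligand type.
In one embodiment, structures are classified by three criteria: same receptor, same host species and same ligand type. Structures that share all three form one cluster, referred to as a triple-identity cluster.
In the present invention, the receptor as a whole covers the three identities: same receptor, same host species and same ligand type. In this document, the set of structures sharing these three features is called a "triple-identity" cluster. The receptor overview page displays the triple-identity clusters for the different ligand types.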
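A minimal sketch of the triple-identity grouping, assuming each structure is available as a dictionary; the keys `receptor`, `species`, `ligand_type` and `pdb_code` are hypothetical names chosen for illustration.

```python
from collections import defaultdict


def group_triple_identity(structures):
    """Group structure identifiers by (receptor, host species, ligand type)."""
    clusters = defaultdict(list)
    for s in structures:
        # Structures sharing all three attributes land in the same cluster.
        key = (s["receptor"], s["species"], s["ligand_type"])
        clusters[key].append(s["pdb_code"])
    return dict(clusters)
```

Two clusters that differ only in the third element of the key (ligand type) are the ones the description shows together on one receptor overview page.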
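Step i.d): create a "ligand" database containing the ligand name, InChI name, IUPAC name, ligand chain, synonyms, molecular formula, residue number in the structure, ligand type, SMILES, molecular weight, logP, TPSA, number of hydrogen-bond donors and acceptors, and number of rotatable bonds.
In this step, ligands are identified using the IUPHAR/BPS Guide to PHARMACOLOGY and the PubChem chemical database, so as to obtain all names of each ligand, and the physicochemical and geometric properties of the ligand are calculated. The information for the ligands in all collected crystal structures is filled in and saved to the "ligand" database of the platform.
In one embodiment, structures without a ligand (Unknown or Apo) are removed.
In one embodiment, structures whose ligand is a protein or peptide are also removed.
In one embodiment, the RDKit cheminformatics toolkit and the chemical database modules (IUPHAR/BPS Guide to PHARMACOLOGY and PubChem) are used to:
look up synonyms of the ligand, because the name in the PDB file is not always identical to the name obtained from GPCRdb;
look up identifiers, such as the IUPAC name and InChI name;
look up structural notations, such as the SMILES string and chemical formula;
compute the physicochemical properties of the ligand, mainly with RDKit, such as molecular weight, logP, TPSA, the number of hydrogen-bond donors and acceptors, and the number of rotatable bonds.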
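2) Interaction fingerprint analysis
Specifically, interaction fingerprint analysis comprises the following steps.
Step ii.a): generate a PDB file suitable for further analysis.
In one embodiment, the following are extracted from the PDB file: the receptor, defined by its chain; the ligand, defined by its three-letter code, residue number and chain; and the water molecules belonging to the same chain as the receptor. These two (when there are no water molecules) or three types of molecules are assembled into a new file named "complex.pdb".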
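The extraction in step ii.a) can be sketched with plain string handling, relying on the fixed-column layout of ATOM and HETATM records in the PDB format (residue name in columns 18-20, chain identifier in column 22, residue number in columns 23-26). The function name and its parameters are assumptions made for illustration, and the sketch treats the receptor as the ATOM records of one chain.

```python
def build_complex(pdb_text, receptor_chain, ligand_resname, ligand_resnum, ligand_chain):
    """Return the text of a reduced PDB file holding receptor, ligand and waters."""
    kept = []
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record not in ("ATOM", "HETATM"):
            continue  # skip headers, TER, CONECT and other records
        resname = line[17:20].strip()
        chain = line[21:22]
        resnum = int(line[22:26])
        # Receptor: protein atoms of the selected chain.
        is_receptor = record == "ATOM" and chain == receptor_chain
        # Ligand: identified by three-letter code, residue number and chain.
        is_ligand = (record == "HETATM" and resname == ligand_resname
                     and chain == ligand_chain and resnum == ligand_resnum)
        # Waters: only those assigned to the receptor chain.
        is_water = record == "HETATM" and resname == "HOH" and chain == receptor_chain
        if is_receptor or is_ligand or is_water:
            kept.append(line)
    kept.append("END")
    return "\n".join(kept) + "\n"
```

The returned text would then be written to "complex.pdb"; when the chain has no water molecules, the result simply contains the two remaining molecule types.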
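Step ii.b): optimize the receptor structure in complex.pdb.
For example, this optimization consists of a single operation: filling in the missing side chains of residues with Schrödinger's computational tools.
No protonation or minimization is needed in this step. The PLIP program (a tool for analyzing non-covalent protein-ligand interactions) already provides protonation, and minimization would disturb the arrangement of the molecules: atoms would move and the interactions could no longer be identified correctly. The crystal structure is already in an optimized state, so no minimization step is added.
Step ii.c): compute the non-covalent interactions between the molecules present in the PDB file, e.g. by analyzing the "complex.pdb" file.
In one embodiment, the interaction types include hydrophobic interactions, hydrogen bonds, water bridges, salt bridges, π-π stacking, π-cation interactions, halogen bonds and metal complexes.
In one embodiment, an interaction database is created for each complex, containing the name of the complex; the name, number, atom name and atom number of each interacting "residue"; the x, y and z coordinates of the atom; the distance; and a second distance (in the case of a water bridge). Here "residue" covers receptor residues, the ligand and water molecules.
In one embodiment, the analysis results are presented in XML format.
Step ii.d): based on the analysis results, keep only the water molecules that take part in water bridges.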
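Step ii.d) can be sketched in two parts: read the bridging waters from the XML results, then drop every other water from the PDB text. The XML element names used here (`water_bridge`, `water_chain`, `water_resnr`) are a hypothetical schema chosen for illustration and are not claimed to match the actual report layout of PLIP.

```python
import xml.etree.ElementTree as ET


def bridging_water_ids(xml_text):
    """Collect (chain, residue number) of waters listed in water bridges.

    Assumes a hypothetical schema with <water_bridge> elements that hold
    <water_chain> and <water_resnr> children.
    """
    root = ET.fromstring(xml_text)
    ids = set()
    for bridge in root.iter("water_bridge"):
        chain = bridge.findtext("water_chain")
        resnr = bridge.findtext("water_resnr")
        if chain is not None and resnr is not None:
            ids.add((chain.strip(), int(resnr)))
    return ids


def keep_bridging_waters(pdb_text, water_ids):
    """Remove HOH records whose (chain, residue number) is not in water_ids."""
    kept = []
    for line in pdb_text.splitlines():
        if line.startswith("HETATM") and line[17:20] == "HOH":
            key = (line[21], int(line[22:26]))
            if key not in water_ids:
                continue  # water not involved in any water bridge
        kept.append(line)
    return "\n".join(kept) + "\n"
```

All non-water records pass through unchanged, so the receptor and ligand atoms of "complex.pdb" are preserved.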
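Step ii.e): modify the PDB file prepared in step ii.c) by explicitly adding helix data, so that it can be rendered in 3D with the Iview technology.
This technology needs secondary structure data in order to recognize the receptor as a protein. For example, the newly generated file is named "complex_iview.pdb".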
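One way to add helix data explicitly is to write HELIX records at the top of the PDB file. The text does not say how the helix ranges are obtained or written, so the sketch below is an assumption: it only formats one record in the fixed-column HELIX layout of the PDB format, given a helix range supplied by the caller.

```python
def helix_record(serial, helix_id, start_res, start_chain, start_num,
                 end_res, end_chain, end_num, helix_class=1):
    """Format one PDB HELIX record (class 1 is a right-handed alpha helix).

    The helix boundaries are inputs; detecting them is outside this sketch.
    """
    length = end_num - start_num + 1
    # Empty strings fill the insertion-code columns and the comment field.
    return ("HELIX  %3d %3s %3s %1s %4d%1s %3s %1s %4d%1s%2d%30s %5d"
            % (serial, helix_id, start_res, start_chain, start_num, "",
               end_res, end_chain, end_num, "", helix_class, "", length))
```

For example, `helix_record(1, "H1", "ALA", "A", 50, "LEU", "A", 80)` describes a hypothetical helix from residue 50 to residue 80 of chain A; such lines would be placed before the ATOM records when writing "complex_iview.pdb".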
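Step ii.f): modify the PDB file prepared in step ii.c) so that only the receptor structure is kept, and align the structures of each triple-identity cluster.
This step shows the structural similarity within a triple-identity cluster. The aligned structures can be displayed on the receptor overview page and are saved, for example, as "aligned_structure.pdb".
Step ii.g): produce the results for the receptor overview page.
This step performs interaction analysis on the structures of one triple-identity cluster (the receptor as a whole), and designs and visualizes the interactions of all complexes in the form of radar charts.
Specifically, the radar charts generated in step ii.g) depend on the results obtained in step ii.c) and are computed over the structures of one triple-identity cluster. Results for structures with agonist-like ligands are shown, for example, in red, and those for structures with antagonist-like ligands in green. If two triple-identity clusters share the same receptor and host species but differ in ligand type, both clusters are displayed on the same receptor overview page.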
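The values plotted on a radar chart can be sketched as a count, per residue, of how many complexes show a given interaction type, kept separately for each ligand type. The input tuple layout and the function name are assumptions for illustration; plotting itself is left out.

```python
from collections import defaultdict


def radar_counts(interactions):
    """Count complexes per residue for each (ligand type, interaction type).

    `interactions` is an iterable of tuples
    (structure_id, ligand_type, residue, interaction_type).
    """
    seen = defaultdict(set)
    for structure_id, ligand_type, residue, interaction_type in interactions:
        # A set ensures each complex is counted once per residue and type,
        # even if it reports several contacts of that type.
        seen[(ligand_type, interaction_type, residue)].add(structure_id)
    counts = defaultdict(dict)
    for (ligand_type, interaction_type, residue), complexes in seen.items():
        counts[(ligand_type, interaction_type)][residue] = len(complexes)
    return {key: dict(value) for key, value in counts.items()}
```

Each key of the result corresponds to one colored trace (for example, agonist in red and antagonist in green) on the radar chart of one interaction type, with residues as axes.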
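Step ii.h): perform interaction analysis on an individual structure, including designing and visualizing the interaction results of the complex as a snake plot and as a two-dimensional diagram.
This step generates the results for the individual structure page and comprises the following sub-steps:
Step ii.h.1): store the snake plot of the receptor and color the residues that interact with the ligand (according to step ii.c)) red (see Fig. 7(b), top);
Step ii.h.2): use the PoseView program to compute and draw the interactions between receptor residues and the ligand (see Fig. 7(b), bottom).
Fig. 7 shows the 2D and 3D visualization of the interaction between the 5-HT2B receptor and ergotamine (PDB ID: 4IB4). Fig. 7(a): 3D visualization of the interactions in the complex. Fig. 7(b), top: snake plot of the receptor showing the transmembrane residues, with interacting transmembrane residues in red. Fig. 7(b), bottom: two-dimensional interaction diagram showing the residues that interact with the ligand structure. Green lines and residues labeled in green indicate hydrophobic interactions, and black dotted lines indicate hydrogen bonds.
The images shown on the platform can be downloaded from the results page in PNG or ZIP format. The calculation result files for the complex of the receptor pocket and the ligand can also be downloaded. The visualization results are summarized in Table 1 below.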
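Table 1: Visualization results
                            All structures of a receptor    Individual structure of a receptor
2D visualization            Radar chart                     Snake plot + 2D diagram
2D data download
NGL (3D visualization)
Iview (3D visualization)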
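II. Use of the platform
The platform can be implemented in various ways, for example with the Django framework together with the web technologies HTML, CSS and JavaScript. CSS is used to style the web pages, and JavaScript is used for the analysis and display of the 3D visualization results.
Specifically, with reference to Fig. 2, use of the platform comprises the following steps:
Step S1: enter a receptor name or structure identifier. The platform does not cover all rhodopsin-like receptors, nor does it hold every existing structure of such receptors. An entry is accepted as valid only if it exists in the platform.
Note that entering a structure identifier is equivalent to looking up its receptor, so the user arrives at the receptor overview page. To analyze a specific structure, the user clicks its identifier on the results page to open the results page for that structure, as shown in Fig. 3.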
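The search behaviour of step S1 can be sketched as a lookup that maps either kind of input to a receptor overview page. The function and the two lookup tables are hypothetical; only the pairing of PDB ID 4IB4 with the 5-HT2B receptor comes from the description.

```python
def resolve_query(query, receptors, pdb_to_receptor):
    """Return the receptor whose overview page should open, or None.

    `receptors` is the set of receptor names held by the platform and
    `pdb_to_receptor` maps upper-case structure identifiers to receptor names.
    """
    q = query.strip()
    if q in receptors:
        return q  # the query is a receptor name known to the platform
    # A structure identifier leads to the overview page of its receptor.
    receptor = pdb_to_receptor.get(q.upper())
    if receptor in receptors:
        return receptor
    return None  # entry not present in the platform, so it is not valid
```

With `receptors = {"5-HT2B"}` and `pdb_to_receptor = {"4IB4": "5-HT2B"}`, both `"5-HT2B"` and `"4ib4"` resolve to the same overview page, from which the individual structure page can then be opened.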
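Fig. 3 shows the result layout for a specific receptor. On the left are the overall 3D visualization of all structures of the receptor and the radar charts of ligand-receptor interactions. The radar charts cover each interaction type, and their total, across all complexes of the receptor. In the 3D visualization, structures are drawn in cartoon representation, each in its own color. The 3D visualization is split into two windows: structures with agonist ligands (ligands not shown) are in the left window, and structures with antagonist ligands (ligands not shown) are in the right window. On the right of Fig. 3 are the results for an individual structure (identified by its PDB ID): the 3D interaction visualization, the snake plot and the two-dimensional diagram.
In one embodiment, each axis of a radar chart corresponds to a receptor residue (with its position) that interacts with the ligand, and the scale indicates in how many complexes the ligand forms the given interaction with that residue. The radar charts are colored by the ligand type of the structures: receptor structures interacting with agonist ligands are shown in red, and those interacting with antagonist ligands in green.
In one embodiment, the snake plot is obtained from the GPCRdb server and customized so that interacting residues are shown in red. Only residues in helices are drawn; residues located in loops are represented by lines.
In one embodiment, the two-dimensional diagram is computed by the PoseView program, which displays and complements the interactions between the ligand and the residues. This result is used unmodified.
Step S2: retrieve the data obtained in part ii), including the structure files, the interaction calculation files and the generated 2D visualization files.
Steps S3 and S4 correspond to the receptor overview results:
the structures obtained in step ii.f) are displayed in 3D, without the ligands; the file used is "aligned_structure.pdb";
the radar charts are displayed: if the receptor, for the same host species, has results for both agonist-bound and antagonist-bound structures, the radar charts of the two groups are shown side by side to facilitate interaction fingerprint analysis and comparison.
Steps S5 and S6 correspond to the results for a specific structure, including:
displaying the 3D visualization of receptor-ligand interactions with the NGL technology, which parses the XML file to render the interactions;
displaying a 3D relief visualization of the structure with Iview;
displaying the snake plot and the two-dimensional diagram.
The following analysis concerns only the human 5-HT2B receptor.
Referring to the visualization of 3D structures in Fig. 4, the structures are drawn in cartoon representation, each in its own color. Structures with agonist ligands (not shown) are in the left window, and structures with antagonist ligands (not shown) are in the right window. In Fig. 4 the structures are aligned and the transmembrane helices overlap well, with a slight offset. The computed RMSD is low: the differences between the structures of this triple-identity cluster are below 3.5 Å. The alignment is based on aligning residues, and proteins are flexible, so even structures crystallized under the same conditions do not necessarily fall below 1 Å. The loops usually adopt more or less different conformations; for example, the helix at the end of structure 6DRZ is not seen in the other structures.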
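For reference, the RMSD between two structures is the square root of the mean squared distance between matched atoms. The sketch below assumes the two coordinate lists are already superimposed and matched atom by atom (for example, equivalent Cα atoms); the superposition itself and the choice of atoms are not specified in the text.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched lists of (x, y, z)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared Euclidean distance between the two matched atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))
```

Coordinates in PDB files are given in ångströms, so the returned value is in ångströms as well and can be compared directly with the 3.5 Å figure quoted above.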
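Fig. 5 shows the radar charts for the total (all), hydrophobic interactions, hydrogen bonds, water bridges, salt bridges, π-π stacking (pi-stacks), π-cation interactions, halogen bonds and metal complexes. These interactions are estimated with PLIP. The interactions with agonist ligands (red) and antagonist ligands (green) are shown next to each other. In Fig. 5, the radar charts show how frequently each residue interacts across the complexes. Residues that interact at high frequency with both ligand types are valine 136 and phenylalanine 340. Some residues appear more often with one ligand type: for example, leucine 132, phenylalanine 341, asparagine 359 and valine 366 occur frequently in interactions with agonist ligands, whereas alanine 255 occurs frequently in interactions with antagonist ligands. Leucine 209 interacts with agonist ligands through hydrogen bonds, but with antagonist ligands through hydrophobic interactions. Other interaction types are also present, but in smaller numbers.
Fig. 7 shows the interactions between the 5-HT2B receptor and the ligand ergotamine (shown at the top left of Fig. 6). Many hydrophobic interactions (shown, for example, in red) and one water bridge (shown, for example, in blue) are present; only the water molecule involved in the water bridge is kept in the PDB file. The interacting residues are located in helices 3, 5, 6 and 7, which are important for adjusting the receptor structure to activate the G protein. In addition, the residues in the receptor pocket are mainly hydrophobic, which indicates that the pocket lies in the interior of the receptor.
These data indicate that the orthosteric binding site lies in the interior of the receptor (Figs. 5 and 6). The binding site is mainly hydrophobic, with some hydrophilic bonds that increase the stability of the ligand in the receptor pocket. Regardless of their type, the ligands have similar structures: ligands of this receptor should contain an ergoline ring, and the attached fragments (moieties) allow ligands with different activities and functions to be developed.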
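The system or platform provided by the present invention can be implemented in software or hardware and is used to realize one or more aspects of the above solution. For example, the whole system comprises a search module, a data preprocessing and storage module (e.g. performing steps i.a to i.d), a processing module, a data analysis module and a result visualization module.
Search module (user side): used to enter, on the platform, a receptor name or structure identifier that the platform holds.
Data preprocessing and storage module: used to collect crystal structures that meet the criteria, for example receptor structures that contain a small-molecule ligand, have a low resolution value and are in the active or intermediate state, and to enter the data on the collected structures, ligands and structure groupings into the databases of the platform.
Processing module: used to clean the PDB file so that it contains only the receptor, the ligand and the water molecules taking part in water bridges. The receptor undergoes side-chain filling as its only structure optimization step. New PDB files are also created, such as "complex_iview.pdb" for display with Iview and "aligned_structure.pdb" for displaying structural similarity.
Data analysis module: used to perform the computational analysis based on geometric and physicochemical properties; the results can be stored as XML files.
Result visualization module (user side): displays the results of the data analysis module as 2D and 3D visualizations.
2D visualization includes: designing and visualizing the receptor-ligand interactions as radar charts, covering hydrophobic interactions, hydrogen bonds, water bridges, salt bridges, π-π stacking, π-cation interactions, halogen bonds, metal complexes and the combination of all interaction types; designing and visualizing the interaction results as a snake plot; and designing and visualizing the interaction results as a two-dimensional interaction diagram.
3D visualization includes: using the NGL technology to read the XML file obtained in step ii.c) and the PDB file obtained in step ii.d) for result analysis and visual display; and, optionally, using the Iview technology to read the PDB file obtained in step ii.e) to visualize the three-dimensional structure.
Note that the cleaned PDB file contains only atom positions, so the PDB file used by Iview differs from the one used by NGL: Iview requires the helix positions to be added. The present invention uses NGL as the primary 3D visualization. Users can view the Iview 3D visualization on the individual structure page.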
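In summary, compared with the prior art, the present invention has at least the following advantages:
1) The files are cleaned so that they contain only the molecules needed to analyze the complex. In addition, so as not to interfere with the file used for interaction analysis, the structure file used for alignment and the structure file used by Iview are saved under separate names.
2) The user does not need to supply any files; the visualizations link directly to the data computed and generated in the platform.
3) 2D visualizations are provided: the radar charts present the interactions by ligand type to give as complete an analysis as possible, and the snake plot is customized to show the interacting residues.
4) Two different 3D visualization technologies are used, so that intermolecular interactions can be identified more completely by eye.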
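The present invention may be a system, a method and/or a computer program product. The computer program product may include a computer-readable storage medium carrying computer-readable program instructions for causing a processor to implement aspects of the present invention.
The computer-readable storage medium may be a tangible device that can hold and store instructions for use by an instruction execution device. The computer-readable storage medium may be, for example, but is not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium include: a portable computer diskette, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disc (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch cards or raised structures in a groove having instructions stored thereon, and any suitable combination of the foregoing. A computer-readable storage medium, as used here, is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (e.g., light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.
The computer-readable program instructions described here can be downloaded from a computer-readable storage medium to respective computing/processing devices, or to an external computer or external storage device via a network, for example the Internet, a local area network, a wide area network and/or a wireless network. The network may comprise copper transmission cables, optical fiber transmission, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. A network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards them for storage in a computer-readable storage medium within the respective computing/processing device.
The computer program instructions for carrying out operations of the present invention may be assembler instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk, C++ and Python, and conventional procedural programming languages such as the "C" language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case involving a remote computer, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, electronic circuitry such as programmable logic circuitry, field-programmable gate arrays (FPGA) or programmable logic arrays (PLA) is personalized using state information of the computer-readable program instructions, and this electronic circuitry can execute the computer-readable program instructions so as to implement aspects of the present invention.
Aspects of the present invention are described here with reference to flowcharts and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It should be understood that each block of the flowcharts and/or block diagrams, and combinations of blocks in the flowcharts and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer or other programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams. These computer-readable program instructions may also be stored in a computer-readable storage medium; the instructions cause a computer, a programmable data processing apparatus and/or other devices to function in a particular manner, such that the computer-readable medium having the instructions stored therein comprises an article of manufacture including instructions that implement aspects of the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
The computer-readable program instructions may also be loaded onto a computer, other programmable data processing apparatus or other devices to cause a series of operational steps to be performed on the computer, other programmable data processing apparatus or other devices to produce a computer-implemented process, such that the instructions executed on the computer, other programmable data processing apparatus or other devices implement the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
The flowcharts and block diagrams in the drawings illustrate the architecture, functionality and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowcharts or block diagrams may represent a module, a program segment or a portion of instructions, which comprises one or more executable instructions for implementing the specified logical functions. In some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two consecutive blocks may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functionality involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by special-purpose hardware-based systems that perform the specified functions or acts, or by combinations of special-purpose hardware and computer instructions. It is well known to those skilled in the art that implementation in hardware, implementation in software, and implementation by a combination of software and hardware are all equivalent.
Embodiments of the present invention have been described above. The foregoing description is illustrative, not exhaustive, and is not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used here was chosen to best explain the principles of the embodiments, their practical application or technical improvement over technologies found in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed here. The scope of the present invention is defined by the appended claims.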

Claims (10)

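  1. A method for analyzing interactions between G protein-coupled receptors and ligands, comprising the following steps:
    receiving a receptor name or crystal structure identifier entered by a user;
    collecting crystal structure information related to the receptor and storing it in a corresponding database;
    cleaning the PDB file and performing computational analysis based on geometric and physicochemical properties to obtain analysis results reflecting the receptor-ligand interactions;
    displaying the relevant data of the analysis results as 2D and 3D visualizations.
  2. The method according to claim 1, characterized in that collecting crystal structure information related to the receptor and storing it in a corresponding database comprises:
    screening an external server for structures that meet selection criteria to obtain a data basis and crystal structure identifiers, wherein the criteria include class A GPCR receptors, structure resolution, receptors in the active or intermediate state, and presence of a small-molecule ligand;
    for the data basis obtained, retrieving data on the crystal structures and storing the collected crystal structure information in a receptor structure database, a structure grouping database, a ligand database and the ligand tools used, wherein the information stored in the receptor structure database includes the receptor name, chain identifier, resolution, host species, reference and ligand name; the structure grouping database is designed to group together structures with the same receptor, the same host species and the same ligand type, referred to as a triple-identity cluster; and the information stored in the ligand database includes the names of the ligand and the physicochemical and geometric properties of the ligand.
  3. The method according to claim 2, characterized in that cleaning the PDB file and performing computational analysis based on geometric and physicochemical properties comprises:
    generating a first PDB file containing only the receptor, the ligand and water molecules;
    optimizing the receptor by filling in residue side chains;
    performing computational analysis based on geometric and physicochemical properties on the first PDB file with the optimized receptor, and storing the analysis results in XML file format.
  4. The method according to claim 3, characterized in that displaying the relevant data of the analysis results as 2D and 3D visualizations comprises:
    processing the PDB file, according to the analysis results, so that only the water molecules involved in "water bridge" bonds are kept;
    adding information on the receptor helices to the processed PDB file and converting it into a second PDB file for 3D visualization;
    removing the molecules other than the receptor from the processed PDB file, aligning the structures of a triple-identity cluster, and saving the aligned structures as a third PDB file to show the structural similarity within the triple-identity cluster;
    performing interaction analysis on the structures of one triple-identity cluster, and designing and visualizing the interactions of all complexes in the form of radar charts;
    performing interaction analysis on an individual structure, including designing and visualizing the interaction results of the complex in the form of a snake plot and in the form of a two-dimensional diagram.
  5. The method according to claim 4, characterized in that there are nine radar charts, respectively reflecting hydrophobic interactions, hydrogen bonds, water bridges, salt bridges, π-π stacking, π-cation interactions, halogen bonds, metal complex interactions and the total.
  6. The method according to claim 4, characterized in that the interaction types of the complex are selected from hydrophobic interactions, hydrogen bonds, water bridges, salt bridges, π-π stacking, π-cation interactions, halogen bonds and metal complex interactions.
  7. The method according to claim 4, characterized in that, for the data of an individual structure, 3D visualization and 2D visualization are accessed through the receptor overview page.
  8. The method according to claim 4, characterized in that, for the snake plot, the file of computed interaction results is used to modify the snake plot, residues taking part in interactions are recolored red, and the modification is limited to residues in helices.
  9. A system for analyzing interactions between G protein-coupled receptors and ligands, comprising:
    a search module, used to receive a receptor name or crystal structure identifier entered by a user;
    a data preprocessing and storage module, used to collect crystal structure information related to the receptor and store it in a corresponding database;
    a data processing and analysis module, used to clean the PDB file and perform computational analysis based on geometric and physicochemical properties to obtain analysis results reflecting the receptor-ligand interactions;
    a result visualization module, used to display the relevant data of the analysis results as 2D and 3D visualizations.
  10. A computer-readable storage medium having a computer program stored thereon, wherein the program, when executed by a processor, implements the steps of the method according to claim 9.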
PCT/CN2022/073979 2022-01-26 2022-01-26 Method and system for analyzing interactions between G protein-coupled receptors and ligands WO2023141808A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2022/073979 WO2023141808A1 (zh) 2022-01-26 2022-01-26 Method and system for analyzing interactions between G protein-coupled receptors and ligands

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2022/073979 WO2023141808A1 (zh) 2022-01-26 2022-01-26 Method and system for analyzing interactions between G protein-coupled receptors and ligands

Publications (1)

Publication Number Publication Date
WO2023141808A1 true WO2023141808A1 (zh) 2023-08-03

Family

ID=87470110

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/073979 WO2023141808A1 (zh) 2022-01-26 2022-01-26 Method and system for analyzing interactions between G protein-coupled receptors and ligands

Country Status (1)

Country Link
WO (1) WO2023141808A1 (zh)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
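CN108647487A (zh) * 2018-04-13 2018-10-12 East China Normal University Method and system for predicting G protein-coupled receptor-ligand interactions
CN111133100A (zh) * 2017-07-05 2020-05-08 The Regents of the University of California Multiplexed receptor-ligand interaction screening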

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22922659

Country of ref document: EP

Kind code of ref document: A1