CN116631502A - Antiviral drug screening method, system and storage medium based on hypergraph learning - Google Patents

Info

Publication number
CN116631502A
Authority
CN
China
Prior art keywords
matrix
virus
similarity matrix
drug
medicine
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310910294.0A
Other languages
Chinese (zh)
Inventor
汤永
王珊
李顺飞
刘建超
刘丽华
高笠雄
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chinese PLA General Hospital
Original Assignee
Chinese PLA General Hospital
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chinese PLA General Hospital
Priority to CN202310910294.0A
Publication of CN116631502A

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B15/00 ICT specially adapted for analysing two-dimensional or three-dimensional molecular structures, e.g. structural or functional relations or structure alignment
    • G16B15/30 Drug targeting using structural data; Docking or binding prediction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/01 Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H70/00 ICT specially adapted for the handling or processing of medical references
    • G16H70/40 ICT specially adapted for the handling or processing of medical references relating to drugs, e.g. their side effects or intended usage
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Medical Informatics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Chemical & Material Sciences (AREA)
  • Biophysics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Epidemiology (AREA)
  • Biotechnology (AREA)
  • Public Health (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Pharmacology & Pharmacy (AREA)
  • Computational Mathematics (AREA)
  • Computing Systems (AREA)
  • Medicinal Chemistry (AREA)
  • Mathematical Analysis (AREA)
  • Bioethics (AREA)
  • Primary Health Care (AREA)
  • Toxicology (AREA)
  • Algebra (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Computational Linguistics (AREA)
  • Investigating Or Analysing Biological Materials (AREA)

Abstract

The application provides an antiviral drug screening method, system and storage medium based on hypergraph learning, belonging to the technical field at the intersection of bioinformatics, computational biology and artificial intelligence. The method comprises the following steps: S1, constructing an adjacency matrix of virus-drug associations; S2, calculating a virus Gaussian distance similarity matrix and a drug Gaussian distance similarity matrix; S3, calculating a virus gene sequence similarity matrix and a drug chemical structure similarity matrix; S4, integrating these with a fast kernel learning method to obtain a virus integrated similarity matrix and a drug integrated similarity matrix; S5, processing the integrated similarity matrices with a spectrum shift method to obtain the corresponding kernel matrices; S6, constructing a loss function with a dual least squares method based on hypergraph learning, and solving it iteratively to obtain a virus-drug prediction score matrix; S7, screening and ranking based on the virus-drug prediction score matrix to obtain the final prediction result. The application can screen effective antiviral therapeutic drugs quickly and efficiently.

Description

Antiviral drug screening method, system and storage medium based on hypergraph learning
Technical Field
The application relates to the technical field at the intersection of bioinformatics, computational biology and artificial intelligence, and in particular to an antiviral drug screening method, system and storage medium based on hypergraph learning.
Background
Developing a new antiviral drug de novo is a long, risky and costly process. Repurposing existing drugs to treat viral diseases can therefore greatly shorten development time and reduce cost, and computational drug repositioning offers a new approach to drug screening. Reported methods fall into two types, structure-based docking screening methods and deep-learning-based prediction methods, but they suffer from strong dependence on structural information and poor interpretability. With the gradual development of artificial intelligence technology, related data mining methods can integrate and exploit information from public biomedical databases to complete the drug repositioning screening task efficiently and accurately.
Disclosure of Invention
The application provides an antiviral drug screening method, system and storage medium based on hypergraph learning, which can screen antiviral drugs accurately and efficiently from virus-drug association data, virus genome sequences and drug chemical structure data.
The first aspect of the embodiments of the present specification discloses an antiviral drug screening method based on hypergraph learning, comprising the following steps:
S1, constructing an adjacency matrix of virus-drug associations;
S2, calculating a virus Gaussian distance similarity matrix and a drug Gaussian distance similarity matrix based on the adjacency matrix of virus-drug associations;
S3, calculating a virus gene sequence similarity matrix based on the virus genome sequences, and calculating a drug chemical structure similarity matrix based on the drug chemical structures;
S4, based on the virus Gaussian distance similarity matrix and the virus gene sequence similarity matrix, integrating with a fast kernel learning method to obtain a virus integrated similarity matrix; based on the drug Gaussian distance similarity matrix and the drug chemical structure similarity matrix, integrating with the fast kernel learning method to obtain a drug integrated similarity matrix;
S5, processing the virus integrated similarity matrix and the drug integrated similarity matrix with a spectrum shift method to obtain a virus kernel matrix and a drug kernel matrix;
S6, constructing a loss function based on the virus kernel matrix and the drug kernel matrix with a dual least squares method based on hypergraph learning, and solving it iteratively to obtain a virus-drug prediction score matrix;
S7, extracting the scores in the row of the target virus from the virus-drug prediction score matrix and ranking them to obtain the final prediction result.
In the embodiments disclosed in the present specification, in S1:
known virus-drug association pairs are input to construct the adjacency matrix A of virus-drug associations;
if an association pair is known, the corresponding position is 1; otherwise it is 0;
the number of rows of the adjacency matrix A is the number of viruses nv, and the number of columns is the number of drugs nd.
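As a minimal sketch of step S1, the adjacency matrix can be built from a list of known pairs. The function name `build_adjacency` and the toy virus and drug identifiers are hypothetical and are not taken from the patent.

```python
import numpy as np

def build_adjacency(pairs, viruses, drugs):
    """Return the nv x nd 0/1 matrix A for the known (virus, drug) pairs."""
    v_index = {v: i for i, v in enumerate(viruses)}
    d_index = {d: j for j, d in enumerate(drugs)}
    A = np.zeros((len(viruses), len(drugs)), dtype=int)
    for v, d in pairs:
        A[v_index[v], d_index[d]] = 1  # known association -> 1, everything else stays 0
    return A

# Hypothetical toy data: 3 viruses, 4 drugs, 3 known associations.
viruses = ["virus_a", "virus_b", "virus_c"]
drugs = ["drug_1", "drug_2", "drug_3", "drug_4"]
pairs = [("virus_a", "drug_1"), ("virus_a", "drug_3"), ("virus_c", "drug_4")]
A = build_adjacency(pairs, viruses, drugs)
```

Rows index viruses and columns index drugs, matching the nv × nd layout described above.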
In the embodiments disclosed in the present specification, in S2:
if drug d(i) is associated with a given virus, the corresponding position is marked 1, otherwise 0, forming a 0/1 vector of size 1×nv, denoted the interaction profile IP(d(i)) of drug d(i); the Gaussian distance similarity between drug d(i) and drug d(j) is then calculated:
$$S_1^d(d(i), d(j)) = \exp\left(-\gamma_d \,\lVert IP(d(i)) - IP(d(j)) \rVert^2\right)$$
in the above formula, the parameter γ_d controls the kernel bandwidth and is obtained by normalizing a new bandwidth parameter γ'_d:
$$\gamma_d = \gamma'_d \Big/ \left( \frac{1}{nd} \sum_{k=1}^{nd} \lVert IP(d(k)) \rVert^2 \right)$$
the Gaussian distance similarity between viruses v(i) and v(j) is defined in a similar manner: a 0/1 vector of size 1×nd is obtained, denoted the interaction profile IP(v(i)) of virus v(i), and the Gaussian distance similarity between viruses v(i) and v(j) is calculated:
$$S_1^v(v(i), v(j)) = \exp\left(-\gamma_v \,\lVert IP(v(i)) - IP(v(j)) \rVert^2\right)$$
the parameter γ_v controls the kernel bandwidth and is obtained by normalizing a new bandwidth parameter γ'_v:
$$\gamma_v = \gamma'_v \Big/ \left( \frac{1}{nv} \sum_{k=1}^{nv} \lVert IP(v(k)) \rVert^2 \right)$$
γ'_d and γ'_v above are constants.
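The Gaussian distance similarity of step S2 can be sketched as follows. This assumes the widely used Gaussian interaction profile kernel, exp(-γ·||IP(i) - IP(j)||²), with γ equal to γ' divided by the mean squared norm of the profiles; the function name `gaussian_similarity` and the toy matrix are hypothetical.

```python
import numpy as np

def gaussian_similarity(profiles, gamma_prime=1.0):
    """Gaussian kernel between the rows of a 0/1 interaction-profile matrix.

    Assumes at least one profile is non-zero so the bandwidth is finite.
    """
    profiles = np.asarray(profiles, dtype=float)
    sq_norms = (profiles ** 2).sum(axis=1)
    gamma = gamma_prime / sq_norms.mean()          # normalized bandwidth
    # Squared Euclidean distance between every pair of profiles.
    sq_dist = sq_norms[:, None] + sq_norms[None, :] - 2.0 * profiles @ profiles.T
    sq_dist = np.maximum(sq_dist, 0.0)             # guard against tiny negative round-off
    return np.exp(-gamma * sq_dist)

# Hypothetical 3-virus x 4-drug adjacency matrix.
A = np.array([[1, 0, 1, 0],
              [0, 0, 1, 0],
              [0, 1, 0, 1]])
S_v = gaussian_similarity(A)     # rows of A are virus profiles   -> 3 x 3
S_d = gaussian_similarity(A.T)   # columns of A are drug profiles -> 4 x 4
```

Both results are symmetric with ones on the diagonal and all entries in (0, 1].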
In the embodiments disclosed in the present specification, in S3:
a virus gene sequence similarity matrix is calculated from the virus genome sequences using multiple sequence alignment;
the MACCS fingerprints of the drugs are obtained from their chemical structures, and the drug chemical structure similarity matrix is calculated using the Tanimoto coefficient (i.e. Jaccard similarity).
In the embodiments disclosed in the present specification, in S4:
the semidefinite programming formulation of the fast kernel learning method is as follows:
where the first term is the reconstruction loss norm term, representing the error of the integrated similarity matrix, and the second term is a regularization term whose role is to avoid overfitting; A is the virus-drug association adjacency matrix, S_j^v (j = 1, 2) denote the virus Gaussian distance similarity matrix and the virus gene sequence similarity matrix respectively, μ_v is the regularization parameter, and λ_v ∈ R^{1×2} is the coefficient vector to be solved; the virus integrated similarity matrix is obtained from λ_v:
similarly, the integration coefficients λ_d ∈ R^{1×2} for the drug chemical structure similarity matrix and the drug Gaussian distance similarity matrix are obtained in the same way, and the drug integrated similarity matrix is then calculated:
where S_j^d (j = 1, 2) denote the drug Gaussian distance similarity matrix and the drug chemical structure similarity matrix respectively.
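The integration in step S4 can be illustrated with a small sketch. The patent's own objective did not survive extraction, so this assumes a form common in published fast kernel learning work: minimize ||λ1·S1 + λ2·S2 - T||_F² + μ·||λ||² subject to λ1 + λ2 = 1 and λ ≥ 0, with a target matrix T such as A·Aᵀ for viruses. The objective, the target and the function name `fkl_two_kernels` are all assumptions. With only two matrices the problem reduces to one variable and has a closed form, so no solver is needed.

```python
import numpy as np

def fkl_two_kernels(S1, S2, target, mu=1.0):
    """Weights (lam1, lam2) and the integrated matrix lam1*S1 + lam2*S2.

    Assumed objective (not confirmed by the source):
        ||t*S1 + (1-t)*S2 - target||_F^2 + mu*(t^2 + (1-t)^2),  0 <= t <= 1, mu > 0.
    """
    D = S1 - S2
    E = S2 - target
    # Stationary point of the one-dimensional convex quadratic in t.
    t = (mu - np.sum(D * E)) / (np.sum(D * D) + 2.0 * mu)
    t = float(np.clip(t, 0.0, 1.0))   # clipping gives the constrained optimum of a 1-D convex function
    lam = np.array([t, 1.0 - t])
    return lam, lam[0] * S1 + lam[1] * S2

# Hypothetical toy matrices: the target equals S1, so S1 should get most of the weight.
S1 = np.eye(2)
S2 = np.ones((2, 2))
lam, S_int = fkl_two_kernels(S1, S2, target=S1, mu=1.0)
```

The regularization term pulls the weights toward equal values, which is why the weight on S1 stays below 1 even though the target equals S1.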
In the embodiments disclosed in the present specification, in S5:
the virus integrated similarity matrix and the drug integrated similarity matrix are processed with the spectrum shift method to obtain the virus kernel matrix K*_v and the drug kernel matrix K*_d.
The specific calculation is as follows: the corresponding integrated similarity matrix S is eigendecomposed as S = UΛU^T, where U is an orthogonal matrix, Λ = diag(λ_1, λ_2, …, λ_n) is the diagonal matrix of real eigenvalues, and λ_min(S) is the minimum eigenvalue of the input matrix S; the spectrum is then shifted using λ_min(S). The spectrum shift method is applied to the virus integrated similarity matrix S_v and the drug integrated similarity matrix S_d in order to strengthen the self-similarity of S_v and S_d without changing the similarity between any two samples.
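A minimal sketch of the spectrum shift in step S5 follows. It assumes the common definition K* = U(Λ + |λ_min(S)|·I)Uᵀ, which equals S + |λ_min(S)|·I; the exact shift used in the patent is not recoverable from the text, and the function name `spectrum_shift` is hypothetical.

```python
import numpy as np

def spectrum_shift(S):
    """Shift every eigenvalue of a symmetric matrix S by |lambda_min(S)|."""
    S = (S + S.T) / 2.0                      # enforce exact symmetry before eigh
    eigvals, U = np.linalg.eigh(S)           # S = U diag(eigvals) U^T
    shift = abs(eigvals.min())
    return U @ np.diag(eigvals + shift) @ U.T

# Hypothetical indefinite similarity matrix with eigenvalues -1 and 3.
S_example = np.array([[1.0, 2.0],
                      [2.0, 1.0]])
K = spectrum_shift(S_example)
```

Only the diagonal changes, so each sample's self-similarity grows while pairwise similarities stay the same, and the result is positive semidefinite.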
In the embodiments disclosed in the present specification, in S6:
a loss function is constructed with a dual least squares method based on hypergraph learning and solved iteratively to obtain the virus-drug prediction score matrix;
first, the objective function is constructed with the dual least squares method based on hypergraph learning as follows:
In the above formula, the 1st term is the reconstruction error term, which measures the difference between the prediction score matrix and the original association matrix; the 2nd and 3rd terms are hypergraph regularization terms, which retain high-order association information; the 4th term is an L2 (Tikhonov) regularization term, which keeps the coefficient matrices smooth to prevent overfitting. Here A is the (known) adjacency matrix of virus-drug association pairs; K*_v ∈ R^{nv×nv} and K*_d ∈ R^{nd×nd} are the kernel matrices of viruses and drugs respectively (obtained in step S5 by applying the spectrum shift method to the integrated similarity matrices); α_v and α_d are the virus and drug coefficient matrices (to be solved); λ_v, λ_d and β are regularization coefficients (constants); ||·||_F is the Frobenius norm, tr(·) denotes the trace of a matrix, and T denotes the matrix transpose. L_v^h is the hypergraph Laplacian matrix of the viruses, defined from the identity matrix I, the constructed virus hypergraph incidence matrix H_v, and D_ve, D_ev and D_vw, which denote the vertex degree diagonal matrix, the hyperedge degree diagonal matrix and the hyperedge weight matrix of the virus hypergraph respectively; these are computed from d_v, δ_v and w_v, the row vectors obtained by summing the vertices, hyperedges and hyperedge weights of the virus hypergraph by rows. L_d^h is the hypergraph Laplacian matrix of the drugs, defined from the identity matrix I, the constructed drug hypergraph incidence matrix H_d, and D_de, D_ed and D_dw, which denote the vertex degree diagonal matrix, the hyperedge degree diagonal matrix and the hyperedge weight matrix of the drug hypergraph respectively; these are computed from d_d, δ_d and w_d, the row vectors obtained by summing the vertices, hyperedges and hyperedge weights of the drug hypergraph by rows;
next, α_v and α_d are updated alternately until convergence:
the final prediction score matrix is obtained by combining the matrices from the virus space and the drug space:
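The hypergraph Laplacian used in the regularization terms can be sketched as follows. The patent's defining formula was lost in extraction, so this assumes the standard normalized form L = I - Dv^(-1/2)·H·W·De^(-1)·Hᵀ·Dv^(-1/2), where Dv, De and W are the vertex degree, hyperedge degree and hyperedge weight diagonal matrices. How the hyperedges themselves are constructed is not covered here, and the function name `hypergraph_laplacian` and the toy incidence matrix are hypothetical.

```python
import numpy as np

def hypergraph_laplacian(H, w=None):
    """Normalized hypergraph Laplacian for an (n_vertices x n_edges) incidence matrix H.

    Assumes every vertex lies in at least one hyperedge and no hyperedge is empty.
    """
    H = np.asarray(H, dtype=float)
    n_vertices, n_edges = H.shape
    w = np.ones(n_edges) if w is None else np.asarray(w, dtype=float)
    d_vertex = H @ w                 # weighted vertex degrees
    d_edge = H.sum(axis=0)           # hyperedge degrees (vertices per hyperedge)
    Dv_inv_sqrt = np.diag(1.0 / np.sqrt(d_vertex))
    De_inv = np.diag(1.0 / d_edge)
    theta = Dv_inv_sqrt @ H @ np.diag(w) @ De_inv @ H.T @ Dv_inv_sqrt
    return np.eye(n_vertices) - theta

# Hypothetical hypergraph: 3 vertices, 2 hyperedges {0, 1} and {1, 2}.
H_example = np.array([[1, 0],
                      [1, 1],
                      [0, 1]])
L_h = hypergraph_laplacian(H_example)
```

Under this assumed form the Laplacian is symmetric positive semidefinite, so a term such as tr(αᵀ·L·α) is a non-negative smoothness penalty.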
In the embodiments disclosed in the present specification, in S7:
the scores in the row of the target virus are extracted from the virus-drug association prediction scores and ranked to obtain the final prediction result.
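Step S7 amounts to sorting one row of the score matrix. The sketch below also skips drugs already known to be associated with the target virus, which is an assumption about how candidates would be reported rather than something the patent states. The function name `rank_drugs_for_virus` and the toy scores are hypothetical.

```python
import numpy as np

def rank_drugs_for_virus(scores, A, virus_idx, drugs, top_k=5):
    """Return the top_k (drug, score) pairs for one virus, best first."""
    row = scores[virus_idx]
    order = np.argsort(-row)                                   # descending by score
    candidates = [j for j in order if A[virus_idx, j] == 0]    # assumption: drop known pairs
    return [(drugs[j], float(row[j])) for j in candidates[:top_k]]

# Hypothetical scores for a single virus and four drugs.
scores = np.array([[0.9, 0.2, 0.7, 0.4]])
known = np.array([[1, 0, 0, 0]])
drug_names = ["drug_1", "drug_2", "drug_3", "drug_4"]
ranking = rank_drugs_for_virus(scores, known, 0, drug_names, top_k=2)
```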
The second aspect of the embodiments of the application discloses an antiviral drug screening system based on hypergraph learning, comprising:
an adjacency matrix construction module, used for constructing an adjacency matrix of virus-drug associations;
a Gaussian distance similarity matrix calculation module, used for calculating a virus Gaussian distance similarity matrix and a drug Gaussian distance similarity matrix based on the adjacency matrix of virus-drug associations;
a virus gene sequence similarity matrix and drug chemical structure similarity matrix calculation module, used for calculating a virus gene sequence similarity matrix based on the virus genome sequences and a drug chemical structure similarity matrix based on the drug chemical structures;
an integrated similarity matrix calculation module, used for integrating the virus Gaussian distance similarity matrix and the virus gene sequence similarity matrix with a fast kernel learning method to obtain a virus integrated similarity matrix, and for integrating the drug Gaussian distance similarity matrix and the drug chemical structure similarity matrix with the fast kernel learning method to obtain a drug integrated similarity matrix;
a kernel matrix determination module, used for processing the virus integrated similarity matrix and the drug integrated similarity matrix with a spectrum shift method to obtain a virus kernel matrix and a drug kernel matrix;
a loss function construction module, used for constructing a loss function based on the virus kernel matrix and the drug kernel matrix with a dual least squares method based on hypergraph learning;
a loss function solving module, used for solving the loss function to obtain a virus-drug prediction score matrix;
a prediction module, used for extracting the scores in the row of the target virus from the virus-drug prediction score matrix and ranking them to obtain the final prediction result;
a processor, connected respectively to the adjacency matrix construction module, the Gaussian distance similarity matrix calculation module, the virus gene sequence similarity matrix and drug chemical structure similarity matrix calculation module, the integrated similarity matrix calculation module, the kernel matrix determination module, the loss function construction module, the loss function solving module and the prediction module;
a memory coupled to the processor and storing a computer program executable on the processor;
when the processor executes the computer program, it controls the adjacency matrix construction module, the Gaussian distance similarity matrix calculation module, the virus gene sequence similarity matrix and drug chemical structure similarity matrix calculation module, the integrated similarity matrix calculation module, the kernel matrix determination module, the loss function construction module, the loss function solving module and the prediction module to operate so as to implement the antiviral drug screening method based on hypergraph learning.
A third aspect of an embodiment of the present application discloses a computer-readable storage medium storing computer instructions that, when read by a computer, perform any one of the above-described hypergraph learning-based antiviral drug screening methods.
In summary, the application has at least the following advantages:
The application constructs an adjacency matrix of virus-drug associations and calculates a virus Gaussian distance similarity matrix and a drug Gaussian distance similarity matrix from it; calculates a virus gene sequence similarity matrix from virus genome sequences and a drug chemical structure similarity matrix from drug chemical structure information; calculates a virus integrated similarity matrix and a drug integrated similarity matrix using a fast kernel learning method; and constructs a loss function by combining the spectrum shift method with a dual least squares method based on hypergraph learning, solves it iteratively to obtain a virus-drug association prediction score matrix, and screens and ranks the scores to obtain the final result. The application can screen effective antiviral therapeutic drugs quickly and efficiently, overcomes the long duration and high cost of biomedical experimental methods, and offers an approach for emergency response in specific situations.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required for the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of steps of an antiviral drug screening method based on hypergraph learning according to the present application.
FIG. 2 is a flow chart of an antiviral drug screening method based on hypergraph learning according to the present application.
FIG. 3 is a graph showing the comparison of the results of the five-fold cross-validation of the hypergraph learning-based antiviral drug screening method and the baseline method according to the present application.
FIG. 4 is a schematic diagram of an antiviral drug screening system based on hypergraph learning according to the present application.
Detailed Description
Hereinafter, only certain exemplary embodiments are briefly described. As will be recognized by those of skill in the pertinent art, the described embodiments may be modified in numerous different ways without departing from the spirit or scope of the embodiments of the present application. Accordingly, the drawings and description are to be regarded as illustrative in nature and not as restrictive.
The following disclosure provides many different implementations, or examples, for implementing different configurations of embodiments of the application. In order to simplify the disclosure of embodiments of the present application, components and arrangements of specific examples are described below. Of course, they are merely examples and are not intended to limit embodiments of the present application. Furthermore, embodiments of the present application may repeat reference numerals and/or letters in the various examples, which are for the purpose of brevity and clarity, and which do not themselves indicate the relationship between the various embodiments and/or arrangements discussed.
Embodiments of the present application will be described in detail below with reference to the accompanying drawings.
It should be noted that the known human virus-drug association data used in the examples of this specification were collected from the relevant literature: experimentally verified drug-virus interaction pairs reported in the literature were first curated using text mining techniques, yielding 455 confirmed human virus-drug interactions involving 34 viruses and 219 drugs (DOI: 10.1016/j.asoc.2021.107135); drug chemical structures were downloaded from the DrugBank database, and virus genome nucleotide sequences were obtained from the National Center for Biotechnology Information (NCBI) database.
As shown in fig. 1 and 2, a first aspect of embodiments of the present specification discloses an antiviral drug screening method based on hypergraph learning, comprising the steps of:
S1, constructing an adjacency matrix of virus-drug associations.
Known virus-drug association pairs are input to construct the adjacency matrix A of virus-drug associations;
the elements A(i, j) of the resulting adjacency matrix are 0 or 1, and its size is 34 rows × 219 columns, with 1 ≤ i ≤ 34 and 1 ≤ j ≤ 219.
S2, calculating a virus Gaussian distance similarity matrix and a drug Gaussian distance similarity matrix based on the adjacency matrix of virus-drug associations.
If drug d(i) is associated with a given virus, the corresponding position is marked 1, otherwise 0, forming a 0/1 vector of size 1×34, denoted the interaction profile IP(d(i)) of drug d(i); the Gaussian distance similarity between drug d(i) and drug d(j) is then calculated:
$$S_1^d(d(i), d(j)) = \exp\left(-\gamma_d \,\lVert IP(d(i)) - IP(d(j)) \rVert^2\right)$$
in the above formula, IP(d(j)) is the interaction profile of drug d(j); the parameter γ_d controls the kernel bandwidth and is obtained by normalizing a new bandwidth parameter γ'_d:
$$\gamma_d = \gamma'_d \Big/ \left( \frac{1}{nd} \sum_{k=1}^{nd} \lVert IP(d(k)) \rVert^2 \right)$$
the Gaussian distance similarity between viruses v(i) and v(j) is defined in a similar manner: if virus v(i) is associated with a given drug, the corresponding position is marked 1, otherwise 0, forming a 0/1 vector of size 1×219, denoted the interaction profile IP(v(i)) of virus v(i); the Gaussian distance similarity between viruses v(i) and v(j) is then calculated:
$$S_1^v(v(i), v(j)) = \exp\left(-\gamma_v \,\lVert IP(v(i)) - IP(v(j)) \rVert^2\right)$$
in the above formula, IP(v(j)) is the interaction profile of virus v(j); the parameter γ_v controls the kernel bandwidth and is obtained by normalizing a new bandwidth parameter γ'_v:
$$\gamma_v = \gamma'_v \Big/ \left( \frac{1}{nv} \sum_{k=1}^{nv} \lVert IP(v(k)) \rVert^2 \right)$$
γ'_d and γ'_v above are constants; here γ'_d = γ'_v = 1.
Here nv denotes the number of viruses, 34 in this example, and nd denotes the number of drugs, 219 in this example. The calculation yields a 34×34 symmetric matrix S_1^v (the virus Gaussian distance similarity matrix) and a 219×219 symmetric matrix S_1^d (the drug Gaussian distance similarity matrix), and the element values of both matrices lie between 0 and 1.
S3, calculating a virus gene sequence similarity matrix based on the virus genome sequences, and calculating a drug chemical structure similarity matrix based on the drug chemical structures.
The virus genome sequences are input, and the virus gene sequence similarity matrix S_2^v is calculated using the multiple sequence alignment tool MAFFT. The chemical structures of the drugs, represented as SMILES strings, are input; the Molecular ACCess System (MACCS) fingerprint of each drug is obtained using the cheminformatics software RDKit or Open Babel, and the Tanimoto similarity is calculated using the R package RxnSim to obtain the drug chemical structure similarity matrix S_2^d. Specifically, for two drugs D(i) and D(j), the bit-string sets of the binary MACCS fragment representations of the two drugs are denoted separately, and the similarity S^d_ij between D(i) and D(j) is calculated using the following formula:
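The Tanimoto (Jaccard) similarity can be sketched in a few lines, assuming the usual definition: the number of bits set in both fingerprints divided by the number of bits set in either. The sketch works on fingerprints that have already been computed as 0/1 vectors. It does not call RDKit, Open Babel or RxnSim, and the function name `tanimoto_matrix` and the 4-bit toy fingerprints are hypothetical (real MACCS keys are much longer).

```python
import numpy as np

def tanimoto_matrix(fingerprints):
    """Pairwise Tanimoto similarity between rows of a 0/1 fingerprint matrix."""
    X = np.asarray(fingerprints, dtype=float)
    common = X @ X.T                                 # bits set in both fingerprints
    counts = X.sum(axis=1)                           # bits set in each fingerprint
    union = counts[:, None] + counts[None, :] - common
    with np.errstate(divide="ignore", invalid="ignore"):
        # Two all-zero fingerprints have an empty union; define their similarity as 0.
        return np.where(union > 0, common / union, 0.0)

# Hypothetical fingerprints for three drugs.
fps = np.array([[1, 1, 0, 0],
                [1, 0, 1, 0],
                [1, 1, 0, 0]])
S_chem = tanimoto_matrix(fps)
```

The first and third fingerprints are identical, so their similarity is 1. The first and second share one bit out of three set in either, giving 1/3.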
S4, integrating the virus Gaussian distance similarity matrix and the virus gene sequence similarity matrix with a fast kernel learning method to obtain a virus integrated similarity matrix; integrating the drug Gaussian distance similarity matrix and the drug chemical structure similarity matrix with the fast kernel learning method to obtain a drug integrated similarity matrix.
The virus gene sequence similarity matrix and the virus Gaussian distance similarity matrix are integrated with the fast kernel learning method by solving the following semidefinite program:
where the first term is the reconstruction loss norm term, representing the error of the integrated similarity matrix, and the second term is a regularization term whose role is to avoid overfitting; A is the virus-drug association adjacency matrix, S_j^v (j = 1, 2) denote the virus Gaussian distance similarity matrix and the virus gene sequence similarity matrix respectively, μ_v is the regularization parameter, and λ_v ∈ R^{1×2} is the coefficient vector to be solved; the problem is solved using the CVX toolbox in Matlab to obtain the virus integrated similarity matrix:
similarly, the integration coefficients λ_d ∈ R^{1×2} for the drug chemical structure similarity matrix and the drug Gaussian distance similarity matrix are obtained in the same way, and the drug integrated similarity matrix is then calculated:
where S_j^d (j = 1, 2) denote the drug Gaussian distance similarity matrix and the drug chemical structure similarity matrix respectively.
S5, processing the virus integrated similarity matrix and the drug integrated similarity matrix with the spectrum shift method to obtain the corresponding virus kernel matrix K*_v and drug kernel matrix K*_d.
The specific calculation is as follows: the corresponding integrated similarity matrix S is eigendecomposed as S = UΛU^T, where U is an orthogonal matrix, Λ = diag(λ_1, λ_2, …, λ_n) is the diagonal matrix of real eigenvalues, and λ_min(S) is the minimum eigenvalue of the input matrix S; the spectrum is then shifted using λ_min(S). The spectrum shift method is applied to the virus integrated similarity matrix S_v and the drug integrated similarity matrix S_d in order to strengthen the self-similarity of S_v and S_d without changing the similarity between any two samples.
S6, construct a loss function using the dual least squares method based on hypergraph learning, and obtain the virus-drug prediction score matrix by iterative solution.
First, the objective function is constructed using the dual least squares method based on hypergraph learning as follows:
where term 1 is the reconstruction error term, which measures the difference between the prediction score matrix and the original association matrix; terms 2 and 3 are the hypergraph regularization terms, which retain high-order association information; and term 4 is the $L_2$ (Tikhonov) regularization term, which keeps the coefficient matrices smooth to prevent overfitting. A is the (known) adjacency matrix of virus-drug association pairs; $K_v^* \in R^{nv \times nv}$ and $K_d^* \in R^{nd \times nd}$ are the virus and drug kernel matrices, respectively (obtained in step S5 by applying the spectrum shift method to the integrated similarity matrices); $\alpha_v$ and $\alpha_d$ are the virus and drug coefficient matrices to be solved; $\lambda_v$, $\lambda_d$ and $\beta$ are regularization coefficients (constants); $\|\cdot\|_F$ is the Frobenius norm, tr(·) denotes the matrix trace, and the superscript T denotes the matrix transpose.
$L_v^h$ is the hypergraph Laplacian matrix of the viruses. In its definition, I is the identity matrix, $H_v$ is the constructed hypergraph incidence matrix of the viruses, and $D_{ve}$, $D_{ev}$ and $D_{vw}$ denote the vertex degree diagonal matrix, the hyperedge degree diagonal matrix and the hyperedge weight matrix of the virus hypergraph, respectively; they are computed from $d_v$, $\delta_v$ and $w_v$, the row vectors obtained by row-wise summation over the vertices, hyperedges and hyperedge weights of the virus hypergraph.
$L_d^h$ is the hypergraph Laplacian matrix of the drugs. In its definition, I is the identity matrix, $H_d$ is the constructed hypergraph incidence matrix of the drugs, and $D_{de}$, $D_{ed}$ and $D_{dw}$ denote the vertex degree diagonal matrix, the hyperedge degree diagonal matrix and the hyperedge weight matrix of the drug hypergraph, respectively; they are computed from $d_d$, $\delta_d$ and $w_d$, the row vectors obtained by row-wise summation over the vertices, hyperedges and hyperedge weights of the drug hypergraph.
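A hypergraph Laplacian of this kind can be sketched as follows, assuming the widely used normalized form $L^h = I - D_v^{-1/2} H W D_e^{-1} H^T D_v^{-1/2}$, where $D_v$ holds the vertex degrees, $D_e$ the hyperedge degrees and $W$ the hyperedge weights. Whether the application uses exactly this normalization is an assumption, and the incidence matrix `H` is taken as given because its construction is not described in this passage.

```python
import numpy as np

def hypergraph_laplacian(H, w=None):
    """Normalized hypergraph Laplacian for an n x m incidence matrix H
    (n vertices, m hyperedges) and optional hyperedge weights w."""
    H = np.asarray(H, dtype=float)
    n, m = H.shape
    w = np.ones(m) if w is None else np.asarray(w, dtype=float)
    d_v = H @ w               # vertex degree: total weight of hyperedges containing the vertex
    d_e = H.sum(axis=0)       # hyperedge degree: number of vertices in the hyperedge
    dv_inv_sqrt = np.zeros(n)
    dv_inv_sqrt[d_v > 0] = d_v[d_v > 0] ** -0.5    # isolated vertices get 0, not infinity
    de_inv = np.zeros(m)
    de_inv[d_e > 0] = 1.0 / d_e[d_e > 0]
    left = dv_inv_sqrt[:, None] * H                # D_v^{-1/2} H
    theta = left @ np.diag(w * de_inv) @ left.T    # D_v^{-1/2} H W D_e^{-1} H^T D_v^{-1/2}
    return np.eye(n) - theta
```

The result is symmetric and positive semidefinite, which is what makes trace terms built from it usable as regularizers.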
Next, when optimizing the virus coefficient matrix $\alpha_v$, the drug coefficient matrix $\alpha_d$ is assumed to be known; setting $\partial E(F)/\partial \alpha_v = 0$ and solving yields $\alpha_v$. Similarly, when optimizing $\alpha_d$, $\alpha_v$ is assumed to be known, and setting $\partial E(F)/\partial \alpha_d = 0$ yields $\alpha_d$. Finally, the updates of $\alpha_v$ and $\alpha_d$ are run alternately until convergence:
The final prediction score matrix is obtained by combining the matrices from the virus space and the drug space.
S7, according to the virus-drug association pair prediction scores, extract the scores in the row of the target virus and sort them to obtain the final prediction result.
When the algorithm is implemented in Matlab, the regularization parameters selected after preliminary optimization are $\beta = 1$, $\lambda_v = 1$ and $\lambda_d = 0.25$. In the above process, the matrix $\alpha_v$ is initialized as a random matrix of 34 rows × 219 columns and $\alpha_d$ as a random matrix of 219 rows × 34 columns, with all elements in the interval (0, 1); the matrix $K_v^*$ has size 34 × 34 and $K_d^*$ has size 219 × 219. The iteration exits after 100 cycles or when the increment is less than $10^{-6}$, and the matrices $\alpha_v$ and $\alpha_d$ are obtained when the iteration ends. The prediction score matrix is then calculated to obtain the final prediction result, and the prediction ends.
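The alternating solution of step S6 with the settings above can be sketched as follows. The sketch assumes the objective $E = \|K_v^*\alpha_v + (K_d^*\alpha_d)^T - 2A\|_F^2 + \lambda_v \mathrm{tr}(\alpha_v^T L_v^h \alpha_v) + \lambda_d \mathrm{tr}(\alpha_d^T L_d^h \alpha_d) + \beta(\|\alpha_v\|_F^2 + \|\alpha_d\|_F^2)$ and the combined score $F = (K_v^*\alpha_v + (K_d^*\alpha_d)^T)/2$. Both are assumed forms in the style of dual regularized least squares and are not confirmed by the text; only the parameter values, matrix shapes and stopping rule come from the description. Setting each partial derivative to zero under this assumption gives the two linear systems solved in the loop.

```python
import numpy as np

def dual_hypergraph_rls(A, Kv, Kd, Lv, Ld, lam_v=1.0, lam_d=0.25, beta=1.0,
                        max_iter=100, tol=1e-6, seed=0):
    """Alternating least squares for the ASSUMED objective in the lead-in.
    A: nv x nd associations; Kv, Kd: symmetric kernels; Lv, Ld: hypergraph Laplacians."""
    nv, nd = A.shape
    rng = np.random.default_rng(seed)
    alpha_v = rng.random((nv, nd))     # random start, entries in [0, 1)
    alpha_d = rng.random((nd, nv))
    # System matrices do not change between iterations (Kv, Kd assumed symmetric).
    Mv = Kv @ Kv + lam_v * Lv + beta * np.eye(nv)
    Md = Kd @ Kd + lam_d * Ld + beta * np.eye(nd)
    for _ in range(max_iter):
        # dE/d(alpha_v) = 0 with alpha_d held fixed
        new_v = np.linalg.solve(Mv, Kv @ (2.0 * A - (Kd @ alpha_d).T))
        # dE/d(alpha_d) = 0 with the fresh alpha_v held fixed
        new_d = np.linalg.solve(Md, Kd @ (2.0 * A.T - (Kv @ new_v).T))
        step = max(np.abs(new_v - alpha_v).max(), np.abs(new_d - alpha_d).max())
        alpha_v, alpha_d = new_v, new_d
        if step < tol:                 # increment below threshold: stop early
            break
    F = (Kv @ alpha_v + (Kd @ alpha_d).T) / 2.0   # assumed combination of both spaces
    return F, alpha_v, alpha_d
```

With the sizes given in the description, `A` would be 34 × 219, `Kv` 34 × 34 and `Kd` 219 × 219, and row i of `F` holds the scores of all drugs for virus i.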
Verification of the effectiveness of the application:
The antiviral drug screening method based on hypergraph learning shown in FIG. 1 and FIG. 2 is evaluated by five-fold cross-validation, implemented as follows: all known drug-virus associations are first randomly divided into 5 groups; each of the 5 groups is then used in turn as the test samples while the other groups serve as training samples (the Gaussian distance similarity matrices depend on which associations are held out as test samples, so they change with each selection of test samples). The training samples are input to the method to obtain the prediction results, and the prediction score of each test sample is then compared with the scores of the candidate samples. To reduce the influence of the random partition on the results, five-fold cross-validation was performed 100 times.
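The partition described above can be sketched as follows. The function name `five_fold_splits` and the list-of-lists representation of the adjacency matrix are illustrative assumptions; the description does not specify how the random division was implemented.

```python
import random

def five_fold_splits(A, n_folds=5, seed=0):
    """Yield (train_matrix, test_pairs) for each fold of the known associations in A."""
    known = [(i, j) for i, row in enumerate(A) for j, a in enumerate(row) if a == 1]
    rng = random.Random(seed)
    rng.shuffle(known)                                   # random division into groups
    folds = [known[k::n_folds] for k in range(n_folds)]  # near-equal group sizes
    for test_pairs in folds:
        train = [list(row) for row in A]                 # copy, so A is left untouched
        for i, j in test_pairs:
            train[i][j] = 0                              # hide test associations from training
        yield train, test_pairs
```

Each `train` matrix would then be used to recompute the Gaussian distance similarity matrices before prediction, and repeating the procedure with different seeds corresponds to the repeated cross-validation runs.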
The following data were obtained after calculation with Matlab. FIG. 3 compares the AUROC (area under the ROC curve) values of the present method, DHRLSVDA, with those of several previously reported virus-drug screening models. The method achieves an AUROC of 0.9372 ± 0.0048 in five-fold cross-validation, a better prediction performance than several classical models.
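For reference, AUROC can be computed from scores and labels as in the sketch below, which uses the pairwise (Mann-Whitney) formulation. How the reported values were actually computed is not stated, so this is an illustration of the metric rather than of the authors' evaluation code.

```python
def auroc(scores, labels):
    """Area under the ROC curve: the fraction of (positive, negative) pairs
    in which the positive sample scores higher; ties count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

In the cross-validation setting, the held-out associations would play the role of positives and the candidate (unknown) pairs the role of negatives.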
In addition, the method is used to predict drugs for a specific virus, for example the novel coronavirus (SARS-CoV-2). The row corresponding to SARS-CoV-2 in the score matrix is extracted to obtain the prediction scores of the drugs associated with SARS-CoV-2; after the scores are sorted in descending order, 18 of the top 20 drugs are supported by published literature.
The table below lists the top 20 predicted drugs and the PMID numbers of the supporting literature.
| No. | Drug name | Supporting evidence |
|---|---|---|
| 1 | Chloroquine | PMID: 33906514 |
| 2 | Alisporivir | PMID: 32376613 |
| 3 | Camostat | PMID: 35692220 |
| 4 | Lopinavir | PMID: 32251767 |
| 5 | Remdesivir | PMID: 32251767, 35221670 |
| 6 | Mycophenolic Acid | PMID: 32579258 |
| 7 | Mefloquine | PMID: 35620103 |
| 8 | Astemizole | PMID: 33932547 |
| 9 | Moxifloxacin | PMID: 33063271 |
| 10 | Gemcitabine | PMID: 32432977 |
| 11 | Ribavirin | PMID: 33689451 |
| 12 | Betulinic Acid | PMID: 35835344 |
| 13 | Tacrolimus | Not yet found |
| 14 | Niclosamide | PMID: 34664162 |
| 15 | Memantine | PMID: 32828269 |
| 16 | Terconazole | PMID: 32817221 |
| 17 | Fluspirilene | PMID: 34311539 |
| 18 | Imatinib | PMID: 35388061 |
| 19 | 6-Azauridine | Not yet found |
| 20 | N4-Hydroxycytidine | PMID: 35492218 |
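The row screening and sorting that produce such a ranking can be sketched as follows. The function name `top_k_drugs` and the placeholder names in the usage note are hypothetical; the score matrix is assumed to have one row per virus and one column per drug, as in the description.

```python
def top_k_drugs(F, virus_names, drug_names, target, k=20):
    """Return the k highest-scoring (drug, score) pairs in the row of the target virus."""
    row = F[virus_names.index(target)]        # scores of all drugs for the target virus
    ranked = sorted(zip(drug_names, row), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]
```

For example, `top_k_drugs(F, ["v1", "v2"], ["d1", "d2", "d3"], "v1", k=2)` returns the two best-scoring drugs for the virus labelled `"v1"`.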
In summary, the advantages of the application are:
1. introducing the $L_2$ norm constraint term prevents overfitting and reduces the adverse effect of noisy data, making the virus-drug association prediction results more accurate and more robust;
2. hypergraph learning theory is introduced and the hypergraph Laplacian terms are incorporated, so that the high-dimensional manifold structure is fully characterized and the high-order information of the hypergraph is used to improve prediction performance.
As shown in FIG. 4, a second aspect of the embodiments of the present application discloses an antiviral drug screening system based on hypergraph learning, comprising:
an adjacency matrix construction module, configured to construct the adjacency matrix of virus-drug associations;
a Gaussian distance similarity matrix calculation module, configured to calculate the viral Gaussian distance similarity matrix and the drug Gaussian distance similarity matrix based on the adjacency matrix of virus-drug associations;
a viral gene sequence similarity matrix and drug chemical structure similarity matrix calculation module, configured to calculate the viral gene sequence similarity matrix based on the viral genome sequences and to calculate the drug chemical structure similarity matrix based on the drug chemical structures;
an integration similarity matrix calculation module, configured to obtain the virus integration similarity matrix by integrating the viral Gaussian distance similarity matrix and the viral gene sequence similarity matrix using the fast kernel learning method, and to obtain the drug integration similarity matrix by integrating the drug Gaussian distance similarity matrix and the drug chemical structure similarity matrix using the fast kernel learning method;
a kernel matrix determination module, configured to process the virus integration similarity matrix and the drug integration similarity matrix using the spectrum shift method to obtain the virus kernel matrix and the drug kernel matrix;
a loss function construction module, configured to construct the loss function based on the virus kernel matrix and the drug kernel matrix using the dual least squares method based on hypergraph learning;
a loss function solving module, configured to solve the loss function to obtain the virus-drug prediction score matrix;
a prediction module, configured to extract the scores in the row of the target virus from the virus-drug prediction score matrix and sort them to obtain the final prediction result;
a processor, connected respectively to the adjacency matrix construction module, the Gaussian distance similarity matrix calculation module, the viral gene sequence similarity matrix and drug chemical structure similarity matrix calculation module, the integration similarity matrix calculation module, the kernel matrix determination module, the loss function construction module, the loss function solving module and the prediction module;
a memory, coupled to the processor and storing a computer program executable on the processor;
wherein, when the processor executes the computer program, the processor controls the adjacency matrix construction module, the Gaussian distance similarity matrix calculation module, the viral gene sequence similarity matrix and drug chemical structure similarity matrix calculation module, the integration similarity matrix calculation module, the kernel matrix determination module, the loss function construction module, the loss function solving module and the prediction module to operate so as to implement the antiviral drug screening method based on hypergraph learning.
A third aspect of an embodiment of the present application discloses a computer-readable storage medium storing computer instructions that, when read by a computer, perform any one of the above-described hypergraph learning-based antiviral drug screening methods.
The above embodiments are provided to illustrate the present application and not to limit the present application, so that the modification of the exemplary values or the replacement of equivalent elements should still fall within the scope of the present application.
From the foregoing detailed description, it will be apparent to those skilled in the art that the present application can be practiced without these specific details, and that the present application meets the requirements of the patent statutes.
While preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the application. The foregoing description of the preferred embodiment of the application is not intended to be limiting, but rather to cover all modifications, equivalents, and alternatives falling within the spirit and principles of the application.
It should be noted that the above description of the flow is only for the purpose of illustration and description, and does not limit the application scope of the present specification. Various modifications and changes to the flow may be made by those skilled in the art under the guidance of this specification. However, such modifications and variations are still within the scope of the present description.
While the basic concepts have been described above, it will be apparent to those of ordinary skill in the art after reading this application that the above disclosure is by way of example only and is not intended to be limiting. Although not explicitly described herein, various modifications, improvements, and adaptations of the application may occur to one of ordinary skill in the art. Such modifications, improvements, and adaptations are suggested by the present disclosure and therefore remain within the spirit and scope of the exemplary embodiments of the present disclosure.
Meanwhile, the present application uses specific words to describe embodiments of the present application. For example, "one embodiment," "an embodiment," and/or "some embodiments" means a particular feature, structure, or characteristic in connection with at least one embodiment of the application. Thus, it should be emphasized and should be appreciated that two or more references to "an embodiment" or "one embodiment" or "an alternative embodiment" in various positions in this specification are not necessarily referring to the same embodiment. Furthermore, certain features, structures, or characteristics of one or more embodiments of the application may be combined as suitable.
Furthermore, those of ordinary skill in the art will appreciate that aspects of the application may be illustrated and described in a number of patentable categories or contexts, including any novel and useful process, machine, product, or material, or any novel and useful improvement thereof. Accordingly, aspects of the present application may be implemented entirely in hardware, entirely in software (including firmware, resident software, micro-code, etc.) or in a combination of hardware and software. The above hardware or software may be referred to as a "unit," "module," or "system." Furthermore, aspects of the present application may take the form of a computer program product embodied in one or more computer-readable media having computer-readable program code embodied therein.
Computer program code required for the operation of portions of the present application may be written in any one or more programming languages, including an object-oriented programming language such as Java, Scala, Smalltalk, Eiffel, JADE, Emerald, C++, C#, VB.NET, Python, etc., a conventional programming language such as the C programming language, Visual Basic, Fortran 2003, Perl, COBOL 2002, PHP, ABAP, a dynamic programming language such as Python, Ruby and Groovy, or other programming languages. The program code may execute entirely on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any form of network, such as a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet), or the code may be provided as a service such as software as a service (SaaS) in a cloud computing environment.
Furthermore, the order in which elements and sequences are presented, and the use of numbers, letters, or other designations in the application, are not intended to limit the order of the processes and methods unless specifically recited in the claims. While certain presently useful inventive embodiments have been discussed in the foregoing disclosure by way of example, it is to be understood that such details are merely illustrative and that the appended claims are not limited to the disclosed embodiments, but, on the contrary, are intended to cover all modifications and equivalent arrangements included within the spirit and scope of the embodiments of the application. For example, while the implementation of the various components described above may be embodied in a hardware device, it may also be implemented as a purely software solution, e.g., an installation on an existing server or mobile device.
Likewise, it should be noted that in order to simplify the presentation of the disclosure and thereby aid in understanding one or more inventive embodiments, various features are sometimes grouped together in a single embodiment, figure, or description thereof. This method of disclosure, however, is not to be interpreted as reflecting an intention that the claimed subject matter requires more features than are expressly recited in each claim. Rather, inventive subject matter may lie in fewer than all the features of a single embodiment described above.

Claims (3)

1. An antiviral drug screening method based on hypergraph learning, characterized by comprising the following steps:
S1, constructing an adjacency matrix of virus-drug associations;
S2, calculating a viral Gaussian distance similarity matrix and a drug Gaussian distance similarity matrix based on the adjacency matrix of virus-drug associations;
S3, calculating a viral gene sequence similarity matrix based on viral genome sequences, and calculating a drug chemical structure similarity matrix based on drug chemical structures;
S4, based on the viral Gaussian distance similarity matrix and the viral gene sequence similarity matrix, integrating by using a fast kernel learning method to obtain a virus integration similarity matrix; based on the drug Gaussian distance similarity matrix and the drug chemical structure similarity matrix, integrating by using the fast kernel learning method to obtain a drug integration similarity matrix;
S5, processing the virus integration similarity matrix and the drug integration similarity matrix by using a spectrum shift method to obtain a virus kernel matrix and a drug kernel matrix, respectively;
S6, constructing a loss function based on the virus kernel matrix and the drug kernel matrix by using a dual least squares method based on hypergraph learning, and obtaining a virus-drug prediction score matrix by iterative solution;
S7, extracting the scores in the row of the target virus from the virus-drug prediction score matrix, and sorting them to obtain a final prediction result;
the specific implementation of S1 is as follows:
inputting the known virus-drug association pairs to construct the adjacency matrix A of virus-drug associations;
if an association pair is known, the corresponding position is 1, otherwise it is 0;
the number of rows of the adjacency matrix A is the number of viruses nv, and the number of columns is the number of drugs nd;
the specific implementation of S2 is as follows:
if drug d(i) is associated with a given virus, the corresponding position is marked 1, otherwise 0, forming a vector of 0s and 1s of size 1 × nv, denoted the interaction profile IP(d(i)) of drug d(i), where nv is the number of viruses; the Gaussian distance similarity between drugs d(i) and d(j) is then calculated:
;
in the above formula, IP(d(j)) is the interaction profile of drug d(j); the parameter $\gamma_d$ controls the kernel bandwidth and is obtained by normalizing a new bandwidth parameter $\gamma'_d$:
;
where nd is the number of drugs; the Gaussian distance similarity between viruses v(i) and v(j) is defined in a similar manner: a vector of 0s and 1s of size 1 × nd is obtained, denoted the interaction profile IP(v(i)) of virus v(i), and the Gaussian distance similarity between viruses v(i) and v(j) is calculated:
;
where IP(v(j)) is the interaction profile of virus v(j); the parameter $\gamma_v$ controls the kernel bandwidth and is obtained by normalizing a new bandwidth parameter $\gamma'_v$:
;
the above $\gamma'_d$ and $\gamma'_v$ are both constants;
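The Gaussian distance similarity of step S2 can be sketched as follows, assuming the standard Gaussian interaction profile kernel $\exp(-\gamma \|IP(i) - IP(j)\|^2)$ with $\gamma = \gamma' / (\tfrac{1}{n}\sum_i \|IP(i)\|^2)$. The formulas themselves are not reproduced in this text, so this form, the default $\gamma' = 1$ and the function name are assumptions.

```python
import numpy as np

def gaussian_profile_similarity(P, gamma_prime=1.0):
    """Gaussian similarity between the rows of P, each row being a 0/1
    interaction profile (assumed standard kernel, see lead-in)."""
    P = np.asarray(P, dtype=float)
    sq_norms = np.sum(P * P, axis=1)
    mean_sq = sq_norms.mean()
    # Normalize the bandwidth by the mean squared profile norm (assumed normalization).
    gamma = gamma_prime / mean_sq if mean_sq > 0 else gamma_prime
    # Squared Euclidean distances between all pairs of profiles.
    sq_dist = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (P @ P.T)
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))
```

With the adjacency matrix `A` of size nv × nd, `gaussian_profile_similarity(A)` would give the virus similarities and `gaussian_profile_similarity(A.T)` the drug similarities.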
the specific implementation of S3 is as follows:
calculating the viral gene sequence similarity matrix from the viral genome sequences by using a multiple sequence alignment method;
obtaining the MACCS fingerprints of the drugs from their chemical structures, and calculating the drug chemical structure similarity matrix by using the Tanimoto coefficient;
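The Tanimoto coefficient between two binary fingerprints is the number of bits set in both divided by the number of bits set in either. A minimal sketch follows, with fingerprints represented as sets of "on" bit indices; generating the fingerprints from chemical structures is outside the sketch, and returning 0 for two empty fingerprints is an assumed convention.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as collections of 'on' bit indices."""
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0   # assumed convention for empty inputs

def tanimoto_matrix(fingerprints):
    """Pairwise Tanimoto similarity matrix as a list of lists."""
    return [[tanimoto(p, q) for q in fingerprints] for p in fingerprints]
```

Applying `tanimoto_matrix` to the fingerprints of all nd drugs yields an nd × nd chemical structure similarity matrix with ones on the diagonal for non-empty fingerprints.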
the specific implementation of S4 is as follows:
the semidefinite programming formulation of the fast kernel learning method is as follows:
;
where the first term is the reconstruction loss norm term, which represents the error of the integrated similarity matrix; the second term is the regularization term, which prevents overfitting; A is the virus-drug association adjacency matrix; $S_j^v$ (j = 1, 2) denote the viral Gaussian distance similarity matrix and the viral gene sequence similarity matrix, respectively; $\mu_v$ is the regularization parameter; $\lambda_v \in R^{1\times 2}$ is the coefficient vector to be solved; and the virus integration similarity matrix $S^v$ is obtained from $\lambda_v$:
;
similarly, the integration coefficients $\lambda_d \in R^{1\times 2}$ of the drug chemical structure similarity matrix and the drug Gaussian distance similarity matrix are obtained in the same way, and the drug integration similarity matrix $S^d$ is then calculated:
;
where $S_j^d$ (j = 1, 2) denote the drug Gaussian distance similarity matrix and the drug chemical structure similarity matrix, respectively;
the specific implementation of S5 is as follows:
processing the virus integration similarity matrix and the drug integration similarity matrix by using the spectrum shift method to obtain the virus kernel matrix $K_v^*$ and the drug kernel matrix $K_d^*$; the specific calculation is as follows: the integrated similarity matrix S is eigendecomposed, where U is an orthogonal matrix, $\Lambda$ is the diagonal matrix of real eigenvalues, $\Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$, and $\lambda_{\min}(S)$ is the minimum eigenvalue of the input matrix S;
the specific implementation of S6 is as follows:
first, the objective function is constructed by using the dual least squares method based on hypergraph learning as follows:
;
where term 1 is the reconstruction error term, which measures the difference between the prediction score matrix and the original association matrix; terms 2 and 3 are the hypergraph regularization terms, which retain high-order association information; and term 4 is the $L_2$ regularization term, which keeps the coefficient matrices smooth to prevent overfitting; A is the adjacency matrix of virus-drug association pairs; $K_v^* \in R^{nv \times nv}$ and $K_d^* \in R^{nd \times nd}$ are the virus kernel matrix and the drug kernel matrix, respectively; $\alpha_v$ and $\alpha_d$ are the virus coefficient matrix and the drug coefficient matrix to be solved, respectively; $\lambda_v$, $\lambda_d$ and $\beta$ are regularization coefficients; $\|\cdot\|_F$ is the Frobenius norm, tr(·) denotes the matrix trace, and the superscript T denotes the matrix transpose; $L_v^h$ is the hypergraph Laplacian matrix of the viruses, in whose definition I is the identity matrix, $H_v$ is the constructed hypergraph incidence matrix of the viruses, and $D_{ve}$, $D_{ev}$ and $D_{vw}$ denote the vertex degree diagonal matrix, the hyperedge degree diagonal matrix and the hyperedge weight matrix of the virus hypergraph, respectively, computed from $d_v$, $\delta_v$ and $w_v$, the row vectors obtained by row-wise summation over the vertices, hyperedges and hyperedge weights of the virus hypergraph; $L_d^h$ is the hypergraph Laplacian matrix of the drugs, in whose definition I is the identity matrix, $H_d$ is the constructed hypergraph incidence matrix of the drugs, and $D_{de}$, $D_{ed}$ and $D_{dw}$ denote the vertex degree diagonal matrix, the hyperedge degree diagonal matrix and the hyperedge weight matrix of the drug hypergraph, respectively, computed from $d_d$, $\delta_d$ and $w_d$, the row vectors obtained by row-wise summation over the vertices, hyperedges and hyperedge weights of the drug hypergraph;
next, the updates of $\alpha_v$ and $\alpha_d$ are run alternately until convergence:
the final prediction score matrix is obtained by combining the matrices from the virus space and the drug space.
2. An antiviral drug screening system based on hypergraph learning, characterized by comprising:
an adjacency matrix construction module, configured to construct the adjacency matrix of virus-drug associations;
a Gaussian distance similarity matrix calculation module, configured to calculate the viral Gaussian distance similarity matrix and the drug Gaussian distance similarity matrix based on the adjacency matrix of virus-drug associations;
a viral gene sequence similarity matrix and drug chemical structure similarity matrix calculation module, configured to calculate the viral gene sequence similarity matrix based on the viral genome sequences and to calculate the drug chemical structure similarity matrix based on the drug chemical structures;
an integration similarity matrix calculation module, configured to obtain the virus integration similarity matrix by integrating the viral Gaussian distance similarity matrix and the viral gene sequence similarity matrix using the fast kernel learning method, and to obtain the drug integration similarity matrix by integrating the drug Gaussian distance similarity matrix and the drug chemical structure similarity matrix using the fast kernel learning method;
a kernel matrix determination module, configured to process the virus integration similarity matrix and the drug integration similarity matrix using the spectrum shift method to obtain the virus kernel matrix and the drug kernel matrix, respectively;
a loss function construction module, configured to construct the loss function based on the virus kernel matrix and the drug kernel matrix using the dual least squares method based on hypergraph learning;
a loss function solving module, configured to solve the loss function to obtain the virus-drug prediction score matrix;
a prediction module, configured to extract the scores in the row of the target virus from the virus-drug prediction score matrix and sort them to obtain the final prediction result;
a processor, connected respectively to the adjacency matrix construction module, the Gaussian distance similarity matrix calculation module, the viral gene sequence similarity matrix and drug chemical structure similarity matrix calculation module, the integration similarity matrix calculation module, the kernel matrix determination module, the loss function construction module, the loss function solving module and the prediction module;
a memory, coupled to the processor and storing a computer program executable on the processor;
wherein, when the processor executes the computer program, the processor controls the adjacency matrix construction module, the Gaussian distance similarity matrix calculation module, the viral gene sequence similarity matrix and drug chemical structure similarity matrix calculation module, the integration similarity matrix calculation module, the kernel matrix determination module, the loss function construction module, the loss function solving module and the prediction module to operate so as to implement the antiviral drug screening method based on hypergraph learning as claimed in claim 1.
3. A computer-readable storage medium storing computer instructions that, when read by a computer, perform the hypergraph learning-based antiviral drug screening method of claim 1.
CN202310910294.0A 2023-07-24 2023-07-24 Antiviral drug screening method, system and storage medium based on hypergraph learning Pending CN116631502A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310910294.0A CN116631502A (en) 2023-07-24 2023-07-24 Antiviral drug screening method, system and storage medium based on hypergraph learning


Publications (1)

Publication Number Publication Date
CN116631502A true CN116631502A (en) 2023-08-22

Family

ID=87642202


Country Status (1)

Country Link
CN (1) CN116631502A (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020095276A1 (en) * 1999-11-30 2002-07-18 Li Rong Intelligent modeling, transformation and manipulation system
CN107507195A (en) * 2017-08-14 2017-12-22 四川大学 The multi-modal nasopharyngeal carcinoma image partition methods of PET CT based on hypergraph model
CN111523582A (en) * 2020-04-16 2020-08-11 厦门大学 Trans-instrument Raman spectrum qualitative analysis method based on transfer learning
CN115966252A (en) * 2023-02-12 2023-04-14 汤永 Antiviral drug screening method based on L1norm diagram
CN116092598A (en) * 2023-01-31 2023-05-09 汤永 Antiviral drug screening method based on manifold regularized non-negative matrix factorization
CN116153391A (en) * 2023-04-19 2023-05-23 中国人民解放军总医院 Antiviral drug screening method, system and storage medium based on joint projection
CN116189760A (en) * 2023-04-19 2023-05-30 中国人民解放军总医院 Matrix completion-based antiviral drug screening method, system and storage medium
CN116230077A (en) * 2023-02-20 2023-06-06 汤永 Antiviral drug screening method based on restarting hypergraph double random walk


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
HONGPENG YANG et al.: "Identifying potential association on gene-disease network via dual hypergraph regularized least squares", BMC GENOMICS, pages 1 - 16 *
YIHUA CHEN et al.: "Learning Kernels from Indefinite Similarities", PROCEEDINGS OF THE 26TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING, pages 145 - 152 *
罗自炎; 祁力群: "半正定张量", 中国科学:数学, no. 05 *

Similar Documents

Publication Publication Date Title
CN116189760B (en) Matrix completion-based antiviral drug screening method, system and storage medium
CN116153391B (en) Antiviral drug screening method, system and storage medium based on joint projection
Su et al. Deep-Resp-Forest: a deep forest model to predict anti-cancer drug response
Stephenson et al. Survey of machine learning techniques in drug discovery
Ramsundar et al. Massively multitask networks for drug discovery
CN116092598B (en) Antiviral drug screening method based on manifold regularized non-negative matrix factorization
CN115966252B (en) Antiviral drug screening method based on L1norm diagram
CN116230077B (en) Antiviral drug screening method based on restarting hypergraph double random walk
CN114913916A (en) Drug relocation method for predicting new coronavirus adaptive drugs
CN116631537B (en) Antiviral drug screening method, system and storage medium based on fuzzy learning
Lin et al. Machine learning in neural networks
Tan et al. Current advances and limitations of deep learning in anticancer drug sensitivity prediction
CN116631502A (en) Antiviral drug screening method, system and storage medium based on hypergraph learning
Das et al. Effective prediction of drug–target interaction on HIV using deep graph neural networks
Nguyen et al. A matrix completion method for drug response prediction in personalized medicine
Ren et al. De novo prediction of Cell-Drug sensitivities using deep learning-based graph regularized matrix factorization
CN116798545B (en) Antiviral drug screening method, system and storage medium based on non-negative matrix
CN116759015B (en) Antiviral drug screening method and system based on hypergraph matrix tri-decomposition
CN116759016A (en) Antiviral drug screening method, system and storage medium based on least square method
CN116705148B (en) Antiviral drug screening method and system based on Laplace least square method
Alghamdi et al. A prediction modelling and pattern detection approach for the first-episode psychosis associated to cannabis use
Li et al. Understanding sequence conservation with deep learning
Wang et al. Predicting RBP binding sites of RNA with high-order encoding features and CNN-BLSTM hybrid model
Ghorbanali et al. DRP-VEM: Drug repositioning prediction using voting ensemble
Testa et al. A Non-Negative Matrix Tri-Factorization Based Method for Predicting Antitumor Drug Sensitivity

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination