CN114974406B - Training method, system, device and product of antiviral drug repositioning model

Info

Publication number
CN114974406B
Authority
CN
China
Prior art keywords: drug, virus, data, model, rule
Legal status
Active
Application number
CN202210512683.3A
Other languages
Chinese (zh)
Other versions
CN114974406A (en
Inventor
宋欣雨
贾志龙
何昆仑
Current Assignee
Chinese PLA General Hospital
Original Assignee
Chinese PLA General Hospital
Priority date
Filing date
Publication date
Application filed by Chinese PLA General Hospital
Priority to CN202210512683.3A
Publication of CN114974406A
Application granted
Publication of CN114974406B

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B15/00 ICT specially adapted for analysing two-dimensional or three-dimensional molecular structures, e.g. structural or functional relations or structure alignment
    • G16B15/30 Drug targeting using structural data; Docking or binding prediction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Chemical & Material Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Animal Behavior & Ethology (AREA)
  • Medicinal Chemistry (AREA)
  • Pharmacology & Pharmacy (AREA)
  • Databases & Information Systems (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Data Mining & Analysis (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biotechnology (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The application discloses a training method, system, device and product for an antiviral drug repositioning model. The training method comprises the following steps: acquiring association data between drugs and viruses from a database; inputting the association data into an antiviral drug repositioning framework composed of a number of models; constructing a rule model, in which a knowledge graph of drugs and viruses is built from the association data and the corresponding rules between drugs and viruses are obtained using algorithms such as random walk; constructing a prediction model, in which a deep neural network is built from the association data and the rule weights between drugs and viruses to generate a prediction score for an antiviral drug; constructing a multi-task learning model that cooperatively trains the rule model and the prediction model to obtain the optimal rule weights; and transmitting the virus data and the drug data to the antiviral drug repositioning framework to complete training. The model has high accuracy and strong interpretability.

Description

Training method, system, device and product of antiviral drug repositioning model
Technical Field
The present application relates generally to the medical field, and more particularly to a method, system, apparatus and product for training an antiviral drug repositioning model.
Background
Research interest in antiviral drugs is currently increasing. Viruses are highly parasitic: they depend on the transcription, translation and energy metabolism systems of host cells to survive and replicate, they hijack the host cell to carry out the processes from invasion through replication to assembly and lysis, and they are harmful to the human body.
In the prior art, antiviral drug repositioning research integrates multi-source data poorly, and the models have poor interpretability. Drug repositioning methods based on similarity in a single property, such as chemical structure, side effects, targets or drug-induced transcriptional responses, are limited: drugs with the same target may have different chemical structures, and different cell lines or drug doses can influence the transcriptional response to a drug, so unavoidable noise biases the research results.
Disclosure of Invention
In view of the above-mentioned deficiencies or inadequacies in the prior art, it would be desirable to provide a training method, system, device and product for an antiviral drug repositioning model.
In one aspect, the present application provides a method for training an antiviral drug repositioning model, comprising:
acquiring association data between a drug and a virus from a database;
inputting the association data between the drug and the virus into an antiviral drug repositioning framework, the antiviral drug repositioning framework being composed of a number of models;
constructing a rule model, and acquiring a corresponding rule between the drug and the virus based on the association data between the drug and the virus, wherein the rule is an association attribute between the drug and the virus;
constructing a prediction model, generating a prediction score of the antiviral drug based on the association data and the weight between the drug and the virus, and analyzing the antiviral possibility of the drug based on the prediction score, wherein the weight between the drug and the virus is obtained based on the association attribute between the drug and the virus;
constructing a multi-task learning model, cooperatively training the rule model and the prediction model, and having the rule model and the prediction model share weights so as to obtain the optimal weights;
transmitting virus data and drug data into the antiviral drug repositioning framework to complete training of the antiviral drug repositioning model.
Further, constructing the rule model and acquiring the corresponding rule between the drug and the virus based on the association data between the drug and the virus specifically comprises:
generating the rule based on the relation sequence of a path connecting two entities.
Further, generating the rule based on the relation sequence of a path connecting two entities further comprises:
a rule learning module generating a plurality of preset rules based on the association data between the drug and the virus;
and a rule screening module screening the preset rules using a hard threshold screening method and/or a soft threshold screening method.
Further, the prediction model includes:
an input layer for acquiring the virus data and the drug data;
a hidden layer that generates a vector based on the virus data and the drug data;
an output layer that generates a prediction score for the antiviral drug based on an output function of the hidden layer, the vector, and a weight between the drug and the virus.
Further, the virus data and the drug data acquired by the input layer are specifically: virus transcriptome expression profiles and drug transcriptome expression profiles.
Further, the association data between the drug and the virus comprises one or more of the following: drug and virus knowledge graph data, and virus and drug transcriptome expression profile data.
Further, before obtaining the association data between the drug and the virus from the database, the method further includes:
based on PubMed, EMBASE, Wiley and drug databases, constructing drug-target associations, virus-drug associations, virus-host protein associations, protein-protein associations and virus-virus family associations to generate the association data between drugs and viruses.
In a second aspect, the present application provides a system for training an antiviral drug relocation model, the system comprising:
an acquisition module configured to acquire, via a database, association data between a drug and a virus;
an input module configured to input association data between the drug and virus into an antiviral drug relocation architecture, the antiviral drug relocation architecture being built from a number of models;
a rule module configured to obtain a corresponding rule between the drug and the virus based on the association data between the drug and the virus, wherein the rule is an association attribute between the drug and the virus;
a prediction module configured to generate a prediction score for the anti-viral drug based on the correlation data and a weight between the drug and the virus, wherein the weight between the drug and the virus is derived based on the correlation attributes between the drug and the virus, and to analyze a drug antiviral likelihood based on the prediction score;
a multi-task learning module configured to cooperatively train the rule model and the prediction model and make the rule model and the prediction model share a weight to obtain an optimal weight;
a training module configured to transmit virus data and drug data into the antiviral drug relocation architecture to complete training of the antiviral drug relocation model.
In a third aspect, the present application provides an apparatus for training an antiviral drug relocation model, including a processor, a memory, and at least one instruction, at least one program, a set of codes, or a set of instructions stored in the memory, wherein the instruction, the program, the set of codes, or the set of instructions is loaded and executed by the processor to implement the method for training an antiviral drug relocation model according to any one of the embodiments of the present application.
In a fourth aspect, the present application provides a computer program product, wherein the instructions of the computer program product, when executed by a processor of a terminal, enable the terminal to perform the method for training an antiviral drug relocation model as described in any one of the embodiments of the present application.
In conclusion, based on the training method, the training system, the training device and the training product for the antiviral drug repositioning model, the accuracy of the prediction model is improved and the objectivity of antiviral drug research is ensured by constructing the rule model, the prediction model and the multi-task learning model.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:
FIG. 1 is a flow chart of a method of training an antiviral drug relocation model provided by an embodiment of the present application;
FIG. 2 is a schematic diagram of an entity a and an entity b provided by an embodiment of the present application;
FIG. 3 is a schematic structural diagram of a prediction model provided in an embodiment of the present application;
FIG. 4 is a block diagram of an antiviral drug relocation model training system provided in an embodiment of the present application;
FIG. 5 is a schematic diagram of an exemplary embodiment of an antiviral drug repositioning model training apparatus;
FIG. 6 is a schematic representation of a drug and virus knowledge graph provided in an embodiment of the present application.
Detailed Description
The present application will be described in further detail with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not to be construed as limiting the invention. It should be noted that, for convenience of description, only the portions related to the present invention are shown in the drawings.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
This application may relate to the general field of medicine to facilitate the study of antiviral drugs, and the following examples of the application illustrate training methods for an antiviral drug relocation model.
With particular reference to fig. 1, the present application provides a method for training an antiviral drug relocation model, comprising:
S101, acquiring association data between the drug and the virus through a database.
Specifically, association relationships among drugs, proteins and viruses are mined from preset databases, and the association data between drugs and viruses is constructed from them. For example, the association data may be a drug and virus knowledge graph.
In some embodiments, the association data between the drug and the virus comprises one or more of: drug and virus knowledge graph data, and virus and drug transcriptome expression profile data.
Specifically, the relationships between drugs and viruses are obtained in several ways to ensure that the data are comprehensive and complete.
In some embodiments, before obtaining the association data between the drug and the virus through the database, the method further comprises:
based on PubMed, EMBASE, Wiley and drug databases, constructing drug-target associations, virus-drug associations, virus-host protein associations, protein-protein associations and virus-virus family associations to generate the association data between drugs and viruses.
Specifically, based on literature databases such as PubMed, EMBASE and Wiley, together with virus-related databases, drug-related databases and other knowledge sources, drug-target associations, virus-drug associations, virus-host protein associations, protein-protein associations and virus-virus family associations are constructed to generate the association data between drugs and viruses, which includes the drug and virus knowledge graph.
By way of example, for drug-virus associations, this embodiment uses associations mined from drug databases and the literature, with the PubChem CID as the unique identifier of the drug and the Taxonomy ID in the NCBI Taxonomy database as the unique identifier of the virus.
For drug-target associations, this embodiment uses the associations provided by databases such as DrugBank, TTD, STITCH, DGIdb, MATADOR and CTD, with the PubChem CID as the unique identifier of the drug and the UniProt ID as the unique identifier of the target.
For virus-drug associations, this embodiment uses the associations provided by the DrugBank database, with the PubChem CID as the unique identifier of the drug.
For virus-host interaction associations, this embodiment uses the VirHostNet and VirusMentha databases, which are currently well-recognized and relatively comprehensive databases of virus-host protein associations. Interaction pairs between human proteins and viral proteins are retrieved from these two databases, with the UniProt ID as the unique identifier of the protein.
For human protein-protein associations, this embodiment uses the associations provided by databases such as BioGRID, HuRI, INstruct, MINT, PINA, SignaLink and InnateDB, with the UniProt ID as the unique identifier of the protein.
For virus-virus family associations, this embodiment uses the associations given by the virus classification and nomenclature rules of the International Committee on Taxonomy of Viruses (ICTV). The relationship between a virus and its family is represented by the ranks realm, kingdom, phylum (subphylum), class, order (suborder), family (subfamily), genus and species, with the Taxonomy ID in the NCBI Taxonomy database as the unique identifier.
In each case, the association data between drugs and viruses is generated based on these unique identifiers.
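The association types described above can be pictured as (head, relation, tail) triples keyed by unique identifiers. The following is only a minimal sketch under that assumption: the triple layout, the relation names, the `inv:` prefix for inverse edges and every identifier shown are hypothetical placeholders for illustration, not records from any of the named databases and not the storage format of the embodiment.

```python
from collections import defaultdict

# Hypothetical triples (head, relation, tail). All identifiers below are
# placeholders for illustration, not records taken from any database.
TRIPLES = [
    ("CID:0001", "drug-target", "UNIPROT:P00001"),
    ("TAXID:0002", "virus-host_protein", "UNIPROT:P00001"),
    ("UNIPROT:P00001", "protein-protein", "UNIPROT:P00002"),
    ("TAXID:0002", "virus-family", "FAMILY:0003"),
    ("TAXID:0002", "virus-drug", "CID:0004"),
]


def build_graph(triples):
    """Index triples as graph[head][relation] -> set of tails.

    An inverse edge (prefixed with "inv:") is added for every triple so
    that paths can be followed in both directions.
    """
    graph = defaultdict(lambda: defaultdict(set))
    for head, rel, tail in triples:
        graph[head][rel].add(tail)
        graph[tail]["inv:" + rel].add(head)
    return graph


graph = build_graph(TRIPLES)
# The drug and the virus are now linked through their shared protein.
shared = graph["CID:0001"]["drug-target"] & graph["TAXID:0002"]["virus-host_protein"]
```

Keying every entity by one identifier scheme per entity type is what lets associations from different sources be merged into a single graph.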
S102, inputting the association data between the drug and the virus into an antiviral drug repositioning framework, wherein the antiviral drug repositioning framework is built from a plurality of models.
Specifically, the antiviral drug repositioning framework is composed of a rule model, a prediction model and a multi-task learning model, and the association data between the drugs and the viruses is input into this multi-model framework so that the framework can operate.
S103, a rule model is constructed, and corresponding rules between the drugs and the viruses are obtained based on the association data between the drugs and the viruses, wherein the rules are association attributes between the drugs and the viruses.
Specifically, a rule model is constructed, and the rules (namely, association attributes) between the drugs and the viruses are obtained from the pre-acquired drug and virus association data (namely, the drug and virus knowledge graph). For example, as shown in FIG. 6, based on the association attributes in the drug and virus knowledge graph, rule 1 for drug 1-virus 2 is "virus-drug, virus-family 1", and its weight, according to the correspondence between rules and weights, is 0.3; rule 2 is "drug-target, virus-target", and its weight is 0.7. Different relations have different weights, and the association attribute is specifically the relation sequence of the path.
In some embodiments, constructing the rule model and obtaining the corresponding rule between the drug and the virus based on the association data between the drug and the virus specifically comprises:
generating the rule based on the relation sequence of a path connecting two entities.
Specifically, the drug and virus association data (i.e., the drug and virus knowledge graph) may be obtained through various methods, and in this embodiment the rule is obtained as the relation sequence of a path connecting two entities. By way of example, $P_k = E_1 r_1 E_2 r_2 E_3$ is a path from $E_1$ to $E_3$, and the corresponding rule is $R = r_1 r_2$. This method focuses on the relations connecting two entities, which facilitates the discovery of potential relations between entities.
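As a minimal sketch of the path-to-rule step just described, the function below drops the entities from an alternating entity/relation path and keeps only the relation sequence. The list encoding of a path and the function name `path_to_rule` are assumptions made for illustration.

```python
def path_to_rule(path):
    """Return the relation sequence of an alternating entity/relation path.

    `path` is [E_1, r_1, E_2, r_2, ..., E_n]; the rule keeps only the
    relations r_1, r_2, ... in order.
    """
    if len(path) % 2 == 0:
        raise ValueError("path must alternate entity, relation, ..., entity")
    return tuple(path[1::2])


# The path E_1 r_1 E_2 r_2 E_3 yields the rule r_1 r_2.
path = ["E1", "r1", "E2", "r2", "E3"]
rule = path_to_rule(path)
```

Two paths that pass through different intermediate entities but use the same relations in the same order map to the same rule.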
In some embodiments, generating the rule based on the relation sequence of a path connecting two entities further comprises:
a rule learning module generating a plurality of preset rules based on the association data between the drug and the virus;
and a rule screening module screening the preset rules using a hard threshold screening method and/or a soft threshold screening method.
Specifically, the corresponding rules are obtained through a rule learning module and a rule screening module.
The rule learning module is configured to obtain a plurality of preset rules. For example, when the rules between a drug and a virus are obtained, several rules may result. As shown in FIG. 2, there are three paths between the entities a and b, $P_1 = r_1 c r_2$, $P_2 = r_1 d r_2$ and $P_3 = r_3 e r_4$, which give the rules $R_1 = r_1 r_2$ and $R_2 = r_3 r_4$.
First, the rule feature from node a to node b under a rule R is defined as follows:
Figure BDA0003638403350000071
wherein $R' = r_1 \dots r_{k-1}$, node e ranges over the set of nodes that can be reached under rule $R'$, and $P(b \mid e, r_k)$ denotes the probability of walking from node e to node b under relation $r_k$. The one-step random walk probability is calculated as follows:
Figure BDA0003638403350000072
wherein:
Figure BDA0003638403350000073
Taking FIG. 2 as an example, the rule feature from node a to node b under rule $R_1$ is:
$F(b \mid a, R_1) = P(c \mid a, r_1) \cdot P(b \mid c, r_2) + P(d \mid a, r_1) \cdot P(b \mid d, r_2)$
Then, the feature vector of all rules from node a to node b is defined as follows:
$x(a,b) = [F(b \mid a, R_1), \dots, F(b \mid a, R_n)]^T$
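The rule features can be illustrated with a small random-walk computation. The sketch below assumes a uniform one-step probability (each neighbour reachable through a relation is equally likely), as in the usual path ranking formulation; the exact formula of the embodiment is given only as an image and may differ. The toy graph is loosely modeled on FIG. 2, with one extra hypothetical edge from d to a node f so that the features are not all equal to 1.

```python
from collections import defaultdict

# Toy graph: EDGES[node][relation] -> list of neighbours.
EDGES = {
    "a": {"r1": ["c", "d"], "r3": ["e"]},
    "c": {"r2": ["b"]},
    "d": {"r2": ["b", "f"]},  # extra hypothetical edge d -r2-> f
    "e": {"r4": ["b"]},
}


def rule_feature(src, dst, rule):
    """Probability of reaching dst from src by a walk that follows `rule`.

    Each step spreads the current probability mass uniformly over the
    neighbours reachable through the next relation (assumed uniform walk).
    """
    dist = {src: 1.0}
    for rel in rule:
        nxt = defaultdict(float)
        for node, prob in dist.items():
            nbrs = EDGES.get(node, {}).get(rel, [])
            for nbr in nbrs:
                nxt[nbr] += prob / len(nbrs)
        dist = nxt
    return dist.get(dst, 0.0)


RULES = [("r1", "r2"), ("r3", "r4")]  # R_1 and R_2
x_ab = [rule_feature("a", "b", rule) for rule in RULES]
# R_1: 0.5 * 1.0 (via c) + 0.5 * 0.5 (via d) = 0.75; R_2: 1.0 * 1.0 = 1.0
```

The first entry reproduces the sum over the two intermediate nodes c and d, one product of step probabilities per path that follows the rule.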
The rule screening module screens the rules produced by the rule learning module to obtain the most useful ones. In the present application, a soft threshold method is used to screen the rules. The soft threshold screening method mainly uses evaluation functions such as the chi-square, linear regression and Sigmoid functions to perform multi-task learning over a plurality of rules, so that the useful information in each rule is obtained.
The screening rules adopted in the application are as follows:
chi-square objective function:
Figure BDA0003638403350000081
linear regression objective function:
Figure BDA0003638403350000082
sigmoid objective function:
Figure BDA0003638403350000083
wherein
Figure BDA0003638403350000084
rule A is a rule preset by the researcher, $x_{(a,b)}$ denotes all rules from entity a to entity b, and $w_i$ denotes the weight of each rule.
In other embodiments, a hard threshold screening method may also be employed to evaluate rule importance.
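The two screening styles can be contrasted with a short sketch. The text does not specify the hard threshold criterion, so the cutoff on rule weights used here is an assumption; likewise, `soft_score` only shows a Sigmoid applied to the weighted rule features and is not the objective function of the embodiment, which is given only as an image. The rule names and the weights 0.3 and 0.7 follow the FIG. 6 example; the threshold `tau` is hypothetical.

```python
import math


def hard_threshold(rule_weights, tau):
    """Keep only rules whose absolute weight reaches the cutoff tau (assumed criterion)."""
    return {rule: w for rule, w in rule_weights.items() if abs(w) >= tau}


def soft_score(weights, features):
    """Sigmoid of the weighted sum of rule features.

    Every rule contributes in proportion to its weight instead of being
    dropped outright.
    """
    z = sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


rule_weights = {
    ("virus-drug", "virus-family"): 0.3,
    ("drug-target", "virus-target"): 0.7,
}
kept = hard_threshold(rule_weights, tau=0.5)  # hypothetical cutoff
score = soft_score(list(rule_weights.values()), [1.0, 1.0])
```

A hard threshold discards low-weight rules entirely, whereas the soft variant keeps all rules and lets the learned weights decide how much each one matters.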
S104, constructing a prediction model, generating a prediction score of the antiviral drug based on the association data and the weight between the drug and the virus, and analyzing the antiviral possibility of the drug based on the prediction score, wherein the weight between the drug and the virus is obtained based on the association attribute between the drug and the virus.
Specifically, based on the rule weights, a deep neural network is used to predict the relationship between the drug and the virus and to generate a prediction score. The prediction score can be used to analyze the antiviral feasibility of the drug: the higher the score, the higher the feasibility.
In some embodiments, the predictive model comprises:
an input layer for acquiring the virus data and the drug data;
a hidden layer that generates a vector based on the virus data and the drug data;
an output layer that generates a prediction score of the antiviral drug based on an output function of the hidden layer, the vector, and a weight between the drug and the virus, wherein the weight is obtained in advance based on the corresponding rule.
Specifically, as shown in fig. 3, the prediction model is composed of an input layer, a hidden layer, and an output layer.
To illustrate, for drugs, this example uses the drug-perturbed transcriptome expression profile data in the L1000 dataset of the LINCS program as the drug data input, denoted Drug(u); for viruses, this example extracts transcriptome expression profile data of virus-infected hosts from the GEO and GEN databases as the virus data input, denoted Virus(v).
First, Drug(u) and Virus(v) are fed to the input layer, and training is performed using positive samples; the input vector
Figure BDA0003638403350000091
is fed into the hidden layer to obtain the hidden layer vector $N_{ui}$. Finally, the output of the last hidden layer is passed through a $\sigma$ output function and a rule-based weight function $f_W$ to obtain the prediction score $y_{ui}$ of the antiviral drug. The $\sigma$ output function may be a Sigmoid function or another function.
The antiviral drug prediction score of the prediction module is: $y_{ui} = f_W(\sigma(N_{ui}), F(i, u \mid R))$, where $f_W(a, b) = a + w^T b$.
The objective function of the prediction module is as follows:
Figure BDA0003638403350000092
where F (i, u | R) is the rule feature vector of the rule module.
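The scoring step can be sketched as a forward pass. The code below is only an illustration under several assumptions that are not stated in the text: the drug and virus profiles are concatenated, a single ReLU hidden layer is used, the hidden output is reduced to a scalar before the Sigmoid, and all parameter names (`hidden_w`, `hidden_b`, `out_w`, `rule_w`) and values are hypothetical. Only the final combination $f_W(a, b) = a + w^T b$ is taken from the description.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense(vec, weights, bias):
    """One fully connected layer; `weights` is a list of rows."""
    return [sum(w * x for w, x in zip(row, vec)) + b for row, b in zip(weights, bias)]


def predict_score(drug_u, virus_v, hidden_w, hidden_b, out_w, rule_w, rule_features):
    """Compute y = sigma(N) + w^T F for one drug-virus pair."""
    x = drug_u + virus_v                                      # concatenated input (assumed)
    h = [max(0.0, z) for z in dense(x, hidden_w, hidden_b)]   # ReLU hidden layer (assumed)
    n_ui = sum(w * v for w, v in zip(out_w, h))               # scalar hidden output (assumed)
    a = sigmoid(n_ui)
    return a + sum(w * f for w, f in zip(rule_w, rule_features))  # f_W(a, b) = a + w^T b


# Toy call: zero network weights give sigma(0) = 0.5, so the score is
# 0.5 plus the weighted rule features.
y = predict_score(
    drug_u=[1.0, 0.0],
    virus_v=[0.0, 1.0],
    hidden_w=[[0.0, 0.0, 0.0, 0.0]],
    hidden_b=[0.0],
    out_w=[0.0],
    rule_w=[0.3, 0.7],
    rule_features=[0.75, 1.0],
)
```

Because the rule term is added after the Sigmoid, the score in this form is not confined to the interval from 0 to 1; it is a ranking score rather than a probability.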
In some embodiments, the virus data and the drug data acquired by the input layer are specifically: virus transcriptome expression profiles and drug transcriptome expression profiles.
Specifically, drug-perturbed transcriptome expression profile data is fed to the input layer as the drug data, and transcriptome expression profile data of virus-infected hosts is fed to the input layer as the virus data; both are used to output the prediction score of the antiviral drug. For the virus and drug transcriptome expression profile data, virus-infection transcriptome expression profiles stored in databases such as GEO and GEN and drug-perturbed transcriptome expression profiles from the L1000 dataset of the LINCS program are collected, preprocessed and stored in a distributed manner, finally forming a database that supports rapid retrieval and access.
S105, constructing a multi-task learning model, cooperatively training the rule model and the prediction model, and enabling the rule model and the prediction model to share weight so as to obtain optimal weight.
Specifically, the rule model and the prediction model are trained cooperatively, and the trained rule weights are passed to both, so that the prediction model and the rule model share weights. A prediction score can be explained by the rules with higher weights, which improves the interpretability of the whole model and the reliability of the weights. The cooperatively trained rule model and prediction model also yield the optimal rules, that is, the optimal weights, which improves the accuracy of the model.
For example, the rule module and the prediction module are subjected to multi-task learning, and the objective function is as follows:
Figure BDA0003638403350000101
wherein V represents the parameters of the prediction module, W represents the weights shared by the rule module and the prediction module, $O_r$ represents the objective function of the rule module, $O_l$ represents the objective function of the prediction module, and $\lambda$ represents the scaling parameter.
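The idea of cooperative training with shared weights can be sketched numerically. The joint objective itself is given only as an image, so the weighted sum $O_l + \lambda O_r$ used below is an assumption, and both loss functions are simple squared-error stand-ins rather than the objective functions of the embodiment. The sample features, labels, learning rate and $\lambda$ value are hypothetical.

```python
def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def rule_loss(w, samples):
    """Stand-in for O_r: squared error of the rule score w^T x against the label."""
    return sum((dot(w, x) - y) ** 2 for x, y in samples)


def prediction_loss(v, w, samples):
    """Stand-in for O_l: squared error of v + w^T x against the label."""
    return sum((v + dot(w, x) - y) ** 2 for x, y in samples)


def joint_objective(v, w, samples, lam):
    """Assumed combined form: O_l(V, W) + lam * O_r(W), with W shared by both terms."""
    return prediction_loss(v, w, samples) + lam * rule_loss(w, samples)


def numeric_grad(f, params, eps=1e-6):
    """Central-difference gradient of f at params."""
    grads = []
    for i in range(len(params)):
        up = params[:i] + [params[i] + eps] + params[i + 1:]
        down = params[:i] + [params[i] - eps] + params[i + 1:]
        grads.append((f(up) - f(down)) / (2 * eps))
    return grads


# Hypothetical samples: (rule feature vector, label).
samples = [([0.75, 1.0], 1.0), ([0.0, 0.1], 0.0)]
params = [0.0, 0.0, 0.0]  # [v, w_1, w_2]; w_1 and w_2 are the shared rule weights


def objective(p):
    return joint_objective(p[0], p[1:], samples, lam=0.5)


start = objective(params)
for _ in range(200):
    grad = numeric_grad(objective, params)
    params = [p - 0.05 * g for p, g in zip(params, grad)]
end = objective(params)
```

Because the same weights appear in both terms, every update moves them toward values that serve the rule objective and the prediction objective at once, which is the sense in which the weights are shared.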
S106, transmitting the virus data and the drug data to the antiviral drug relocation framework so as to complete the training of the antiviral drug relocation model.
Specifically, drug and virus data are input into a relocation framework for training to obtain an available antiviral drug relocation model.
For example, dengue virus (DENV) is selected as the test case: for the candidate antiviral drugs, a therapeutic index between each candidate drug and the virus is calculated to evaluate the relationship between them, thereby verifying the accuracy of the antiviral drug repositioning model. This improves the information integration capability and the prediction accuracy of the whole model.
In conclusion, based on the training method of the antiviral drug repositioning model, the accuracy of the prediction model is improved and the objectivity of antiviral drug research is ensured by constructing the rule model, the prediction model and the multi-task learning model.
With further reference to FIG. 4, a schematic diagram of a training system 200 for an antiviral drug repositioning model is shown, according to one embodiment of the present application.
An obtaining module 210 configured to obtain, through a database, association data between a drug and a virus;
an input module 220 configured for inputting association data between the drug and virus into an antiviral drug relocation architecture, the antiviral drug relocation architecture being built from a number of models;
a rule module 230 configured to obtain a rule between the corresponding drug and the virus based on the association data between the drug and the virus, where the rule is an association attribute between the drug and the virus;
a prediction module 240 configured to generate a prediction score of the antiviral drug based on the correlation data and a weight between the drug and the virus, wherein the weight between the drug and the virus is obtained based on the correlation attribute between the drug and the virus, and to analyze the antiviral possibility of the drug based on the prediction score;
a multitask learning module 250 configured to cooperatively train the rule model and the prediction model and make the rule model and the prediction model share a weight to obtain an optimal weight;
a training module 260 configured to transmit virus data and drug data into the antiviral drug relocation architecture to complete training of the antiviral drug relocation model.
Although several modules or units have been mentioned in the above detailed description, such a division is not mandatory. Indeed, the features and functionality of two or more modules or units described above may be embodied in one module or unit, according to embodiments of the present disclosure. Conversely, the features and functions of one module or unit described above may be further divided among a plurality of modules or units.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operational instructions of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions. The foregoing description is only exemplary of the preferred embodiments of the application and is illustrative of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the disclosure herein is not limited to the particular combination of features described above, but also encompasses other arrangements formed by any combination of the above features or their equivalents without departing from the spirit of the disclosure. For example, the above features may be replaced with (but not limited to) features having similar functions disclosed in the present application.
With further reference to FIG. 5, a schematic diagram of a training apparatus 300 for an antiviral drug repositioning model is shown, according to one embodiment of the present application.
The training method for the antiviral drug relocation model in this embodiment is executed by a training device for the antiviral drug relocation model, which may be implemented in software and/or hardware. The training device in this embodiment may be configured in an electronic device, or in a server that communicates with the electronic device in order to control it.
The electronic device in this embodiment may include, but is not limited to, a personal computer, a tablet computer, a smart phone, and the like; the electronic device is not particularly limited in this embodiment.
The training apparatus 300 for an antiviral drug relocation model of the present embodiment comprises a processor and a memory, the processor and the memory being connected to each other, wherein the memory is used for storing a computer program, the computer program comprises program instructions, and the processor is configured to invoke the program instructions to execute the method according to any one of the above.
In the embodiment of the present application, the processor is a processing device capable of performing logic operations, for example a device having data processing capability and/or program execution capability such as a Central Processing Unit (CPU), a Field Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), a Microcontroller Unit (MCU), an Application-Specific Integrated Circuit (ASIC), or a Graphics Processing Unit (GPU). It will be readily appreciated that the processor is typically communicatively coupled to the memory, and that any combination of one or more computer program products may be stored on the memory, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. Volatile memory can include, for example, Random Access Memory (RAM), cache memory, and/or the like. The non-volatile memory may include, for example, Read Only Memory (ROM), a hard disk, an Erasable Programmable Read Only Memory (EPROM), USB memory, flash memory, and the like. One or more computer instructions may be stored on the memory and executed by the processor to implement the associated analysis functions. Various applications and various data, such as data used and/or generated by the applications, may also be stored in the computer-readable storage medium.
In the embodiment of the present application, each module may be implemented by a processor executing related computer instructions. For example, the obtaining module may be implemented by the processor executing acquisition instructions, the input module may be implemented by the processor executing instructions of a rule model, and the neural network may be implemented by the processor executing instructions of a neural network algorithm.
In the embodiment of the present application, the modules may run on the same processor or on multiple processors. The modules may run on processors of the same architecture, for example all on processors of an X86 architecture, or on processors of different architectures, for example an image processing module running on a CPU of an X86 architecture and a machine learning module running on a GPU. The modules may be packaged in one computer product, for example all packaged in one piece of computer software running on one computer (server), or packaged separately or partially in different computer products, for example the image processing module packaged in one piece of computer software running on one computer (server) and the machine learning module packaged in separate computer software running on another computer (server). The computing platform on which each module executes may be local computing, cloud computing, or hybrid computing formed by local computing and cloud computing.
The computer system includes a Central Processing Unit (CPU) 301, which can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 302 or a program loaded from a storage section 308 into a Random Access Memory (RAM) 303. The RAM 303 also stores various programs and data necessary for the operation of the system. The CPU 301, ROM 302, and RAM 303 are connected to each other via a bus 304. An input/output (I/O) interface 305 is also connected to the bus 304.
The following components are connected to the I/O interface 305: an input section 306 including a keyboard, a mouse, and the like; an output section 307 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker; a storage section 308 including a hard disk and the like; and a communication section 309 including a network interface card such as a LAN card, a modem, or the like. The communication section 309 performs communication processing via a network such as the Internet. A drive 310 is also connected to the I/O interface 305 as needed. A removable medium 311, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 310 as necessary, so that a computer program read from it can be installed into the storage section 308 as necessary.
In particular, according to embodiments of the present application, the process described above with reference to the flowchart of FIG. 1 may be implemented as a computer software program. For example, embodiments of the present application include a computer program product comprising a computer program embodied on a computer-readable medium, the computer program comprising program code for performing the method illustrated in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 309, and/or installed from the removable medium 311. The above-described functions defined in the system of the present application are executed when the computer program is executed by the Central Processing Unit (CPU) 301.
An electronic device provided by the embodiment of the application is provided with a computer readable storage medium, wherein the computer readable storage medium stores a computer program, and the computer program is executed by a processor to implement the method according to any one of the above.
It should be noted that the computer readable medium described in the present application may be a computer readable signal medium, a computer readable storage medium, or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a Read-Only Memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this application, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. A computer readable signal medium, by contrast, may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electromagnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
In one embodiment, a computer program product is provided, the instructions in which, when executed by a processor of an electronic device, enable a training apparatus of an antiviral drug relocation model to perform the steps of: acquiring the association data between the drug and the virus through a database;
inputting the association data between the drug and virus into an antiviral drug relocation framework, the antiviral drug relocation framework being composed of a number of models;
constructing a rule model, and acquiring a corresponding rule between the drug and the virus based on the association data between the drug and the virus, wherein the rule is an association attribute between the drug and the virus;
constructing a prediction model, generating a prediction score of the antiviral drug based on the association data and the weight between the drug and the virus, and analyzing the antiviral possibility of the drug based on the prediction score, wherein the weight between the drug and the virus is obtained based on the association attribute between the drug and the virus;
constructing a multi-task learning model, cooperatively training the rule model and the prediction model, and enabling the rule model and the prediction model to share weights so as to obtain optimal weights;
transmitting virus data and drug data into the antiviral drug relocation framework to complete training of the antiviral drug relocation model.
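The cooperative training of the rule model and the prediction model with a shared weight can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the application: the class name `SharedWeightModel`, the toy data, the tanh hidden layer, the sigmoid output, the cross-entropy loss, and the mixing factor `lam` are all hypothetical. It only shows one way in which a single weight vector over rules can receive gradient updates from both tasks.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


class SharedWeightModel:
    """Toy rule model and prediction model sharing one rule-weight vector (hypothetical)."""

    def __init__(self, n_features, n_hidden, n_rules, seed=0):
        rng = random.Random(seed)
        # Hidden-layer weights: map drug/virus features to a hidden vector.
        self.W_h = [[rng.uniform(-0.5, 0.5) for _ in range(n_features)]
                    for _ in range(n_hidden)]
        # Output weights applied to the hidden vector.
        self.u = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        # Rule weights, shared by the rule task and the prediction task.
        self.w = [0.0] * n_rules

    def hidden(self, x):
        return [math.tanh(dot(row, x)) for row in self.W_h]

    def rule_score(self, r):
        # Rule task: score a drug-virus pair from rule indicators only.
        return sigmoid(dot(self.w, r))

    def predict(self, x, r):
        # Prediction task: hidden vector plus the shared rule weights.
        return sigmoid(dot(self.u, self.hidden(x)) + dot(self.w, r))

    def train_step(self, x, r, y, lr=0.1, lam=0.5):
        # One gradient step on: cross_entropy(prediction) + lam * cross_entropy(rule).
        h = self.hidden(x)
        err_pred = sigmoid(dot(self.u, h) + dot(self.w, r)) - y
        err_rule = self.rule_score(r) - y
        for j, row in enumerate(self.W_h):
            g = err_pred * self.u[j] * (1.0 - h[j] ** 2)
            for i in range(len(row)):
                row[i] -= lr * g * x[i]
        for j in range(len(self.u)):
            self.u[j] -= lr * err_pred * h[j]
        # The shared weights receive gradients from both tasks.
        for k in range(len(self.w)):
            self.w[k] -= lr * (err_pred + lam * err_rule) * r[k]


# Hypothetical toy data: (feature vector, rule indicators, known association label).
DATA = [
    ([1.0, 0.0], [1.0, 0.0], 1),
    ([0.9, 0.1], [1.0, 0.0], 1),
    ([0.0, 1.0], [0.0, 1.0], 0),
    ([0.1, 0.9], [0.0, 1.0], 0),
]


def train(epochs=200):
    model = SharedWeightModel(n_features=2, n_hidden=3, n_rules=2)
    for _ in range(epochs):
        for x, r, y in DATA:
            model.train_step(x, r, y)
    return model
```

Calling `train()` returns a model whose `predict(x, r)` gives a score between 0 and 1 for a drug-virus pair. Because `w` appears in both `rule_score` and `predict`, a rule that is informative for the rule task also shifts the prediction score.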
It will be understood that the terms "length," "width," "upper," "lower," "front," "rear," "left," "right," "vertical," "horizontal," "top," "bottom," "inner," "outer," and the like, as used herein, refer to an orientation or positional relationship indicated in the drawings that is solely for the purpose of facilitating the description and simplifying the description, and do not indicate or imply that the device or element being referred to must have a particular orientation, be constructed and operated in a particular orientation, and is therefore not to be construed as limiting the invention.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or as implicitly indicating the number of technical features referred to. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the present invention, "a plurality" means two or more unless specifically defined otherwise.
Unless defined otherwise, technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. Terms such as "disposed" and the like, as used herein, may refer to one element being directly attached to another element or one element being attached to another element through intervening elements. Features described herein in one embodiment may be applied to another embodiment, either alone or in combination with other features, unless the feature is otherwise inapplicable or otherwise stated in the other embodiment.
The present invention has been described in terms of the above embodiments, but it should be understood that the above embodiments are for purposes of illustration and description only and are not intended to limit the invention to the scope of the described embodiments. It will be appreciated by those skilled in the art that many variations and modifications are possible in light of the above teaching and are within the scope of the invention as claimed.

Claims (7)

1. A method of training an antiviral drug relocation model, comprising:
acquiring the association data between the drug and the virus through a database;
inputting the association data between the drug and virus into an antiviral drug relocation framework, the antiviral drug relocation framework being composed of a number of models;
constructing a rule model, and acquiring a rule between the corresponding drug and the corresponding virus based on the association data between the drug and the virus, wherein the rule is an association attribute between the drug and the virus and is generated based on a relationship sequence of two entity paths, and wherein constructing the rule model further comprises: a rule learning module generating a plurality of preset rules based on the association data between the drugs and the viruses; and a rule screening module screening out the most useful preset rule based on a hard threshold screening method and/or a soft threshold screening method, wherein the preset rule is obtained by performing multi-task learning on a plurality of rules based on a chi-square objective function, a linear regression function and a Sigmoid function so as to obtain useful information in the rules;
constructing a prediction model, generating a prediction score of the antiviral drug based on the association data and the weight between the drug and the virus, and analyzing the antiviral possibility of the drug based on the prediction score, wherein the weight between the drug and the virus is obtained based on the association attribute between the drug and the virus;
constructing a multi-task learning model, cooperatively training the rule model and the prediction model, and enabling the rule model and the prediction model to share weight so as to obtain optimal weight;
transmitting virus data and drug data into the antiviral drug relocation framework to complete training of the antiviral drug relocation model;
wherein the predictive model comprises:
an input layer for acquiring the virus data and the drug data;
a hidden layer that generates a vector based on the virus data and the drug data;
an output layer to generate a prediction score for the anti-viral drug based on an output function of the hidden layer, the vector, a weight between the drug and a virus.
2. The method according to claim 1, wherein the virus data and the drug data acquired by the input layer are, in particular: viral transcriptome expression profiles and drug transcriptome expression profiles.
3. The method of claim 1, wherein the association data between the drug and the virus further comprises one or more of: drug and virus knowledge graph data, virus and drug transcriptome expression profile data.
4. The method of claim 1, wherein prior to obtaining the association data between the drug and the virus via the database, further comprising:
based on PubMed, EMBASE, Wiley and a drug database, constructing a drug and target association relationship, a virus and drug association relationship, a virus and host protein association relationship, a protein and protein association relationship, and a virus and virus family association relationship, so as to generate the association data between drugs and viruses.
5. A training system for an antiviral drug relocation model, said system comprising:
an acquisition module configured to acquire, via a database, association data between a drug and a virus;
an input module configured to input association data between the drug and virus into an antiviral drug relocation architecture, the antiviral drug relocation architecture being built from a number of models;
a rule model building module configured to obtain a rule between the corresponding drug and the virus based on the association data between the drug and the virus, wherein the rule is an association attribute between the drug and the virus and is generated based on a relationship sequence of two entity paths, and wherein the rule model building module further includes: a rule learning module for generating a plurality of preset rules based on the association data between the drugs and the viruses; and a rule screening module that screens out the most useful preset rule based on a hard threshold screening method and/or a soft threshold screening method, wherein the preset rule is obtained by performing multi-task learning on a plurality of rules based on a chi-square objective function, a linear regression function and a Sigmoid function so as to obtain useful information in the rules;
a building prediction model module configured to generate a prediction score of the antiviral drug based on the association data and a weight between the drug and the virus, wherein the weight between the drug and the virus is obtained based on an association attribute between the drug and the virus, and analyze the antiviral possibility of the drug based on the prediction score;
a multi-task learning module configured to cooperatively train the rule model and the prediction model and to share a weight between the rule model and the prediction model to obtain an optimal weight;
a training module configured to transmit virus data and drug data into the antiviral drug relocation architecture to complete training of the antiviral drug relocation model;
wherein the predictive model comprises:
an input layer for acquiring the virus data and the drug data;
a hidden layer that generates a vector based on the virus data and the drug data;
an output layer to generate a prediction score for the anti-viral drug based on an output function of the hidden layer, the vector, a weight between the drug and a virus.
6. An apparatus for training an antiviral drug relocation model, comprising a processor, a memory, said memory having stored therein at least one instruction, at least one program, set of codes, or set of instructions, said instruction, said program, said set of codes, or said set of instructions being loaded and executed by said processor to implement the method for training an antiviral drug relocation model according to any of claims 1-4.
7. A computer storage medium having stored thereon a computer program, which when executed by a processor implements a method of training an antiviral drug relocation model as claimed in any one of claims 1 to 4.
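As an illustration only, the difference between the hard threshold screening and the soft threshold screening recited in claims 1 and 5 can be sketched as follows. The sketch rests on assumptions that are not stated in the claims: each candidate rule is scored with a Pearson chi-square statistic on a 2x2 table of "rule fires" against "drug-virus pair is a known association", hard screening drops rules below a cutoff, and soft screening keeps every rule but down-weights it with a Sigmoid. The rule names, the counts, and the cutoff of 3.84 are hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for a 2x2 contingency table.

    a: rule fires and pair is associated      b: rule fires, pair not associated
    c: rule silent and pair is associated     d: rule silent, pair not associated
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # degenerate table: no evidence either way
    return n * (a * d - b * c) ** 2 / denom


def hard_threshold(scores, cutoff):
    # Hard screening: a rule is either kept or discarded.
    return {name: s for name, s in scores.items() if s >= cutoff}


def soft_threshold(scores, cutoff, temperature=1.0):
    # Soft screening: every rule is kept with a Sigmoid weight in (0, 1).
    return {name: 1.0 / (1.0 + math.exp(-(s - cutoff) / temperature))
            for name, s in scores.items()}


# Hypothetical candidate rules, named by their relation sequence, with made-up counts.
COUNTS = {
    "drug-targets-protein-interacts-virus": (8, 2, 2, 8),
    "drug-treats-virus-same_family-virus": (5, 5, 5, 5),
}
SCORES = {name: chi_square_2x2(*table) for name, table in COUNTS.items()}

kept = hard_threshold(SCORES, cutoff=3.84)
weights = soft_threshold(SCORES, cutoff=3.84)
```

With these made-up counts, hard screening keeps only the first rule, while soft screening keeps both and assigns the second a weight close to zero.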
CN202210512683.3A 2022-05-11 2022-05-11 Training method, system, device and product of antiviral drug repositioning model Active CN114974406B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210512683.3A CN114974406B (en) 2022-05-11 2022-05-11 Training method, system, device and product of antiviral drug repositioning model

Publications (2)

Publication Number Publication Date
CN114974406A CN114974406A (en) 2022-08-30
CN114974406B true CN114974406B (en) 2023-04-14

Family

ID=82981696

Country Status (1)

Country Link
CN (1) CN114974406B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114093527A (en) * 2021-12-01 2022-02-25 中国科学院新疆理化技术研究所 Drug relocation method and system based on spatial similarity constraint and non-negative matrix factorization
CN114242186A (en) * 2021-12-30 2022-03-25 湖南大学 Chinese and western medicine relocation method and system fusing GHP and GCN and storage medium

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109411019B (en) * 2018-12-12 2020-05-05 中国人民解放军军事科学院军事医学研究院 Medicine prediction method, device, server and storage medium
CN111081316A (en) * 2020-03-25 2020-04-28 元码基因科技(北京)股份有限公司 Method and device for screening new coronary pneumonia candidate drugs
CN111696685A (en) * 2020-06-04 2020-09-22 大连理工大学 Medicine repositioning method for new coronavirus treatment medicine and application thereof
CN113948160A (en) * 2020-07-15 2022-01-18 武汉Tcl集团工业研究院有限公司 Drug screening method, device and storage medium
CN111916145B (en) * 2020-07-24 2022-03-11 湖南大学 Novel coronavirus target prediction and drug discovery method based on graph representation learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant