CN117079762A

CN117079762A - Drug effect prediction model training method, drug effect prediction method and device thereof

Info

Publication number: CN117079762A
Application number: CN202311239893.0A
Authority: CN
Inventors: 张莹莹; 吴贤; 张渝; 郑冶枫
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2023-09-25
Filing date: 2023-09-25
Publication date: 2023-11-17
Anticipated expiration: 2043-09-25
Also published as: CN117079762B

Abstract

A drug effect prediction model training method, a drug effect prediction method, devices therefor, and a computer-readable medium are disclosed. The training method of the drug effect prediction model comprises the following steps: inputting a dataset into a drug effect prediction model, the dataset comprising a plurality of samples, each sample comprising at least one first entry, at least one second entry, and a real label; acquiring a knowledge graph associated with the drug in each first entry, the knowledge graph comprising a vector representation of at least one entity associated with the drug; inputting the at least one first entry and the vector representation of the at least one entity associated with the drug into a first encoder to obtain a first entry enhancement vector representation; inputting the at least one second entry into a second encoder to obtain a second entry vector representation; calculating a mutual attention representation between the first entry enhancement vector representation and the second entry vector representation; inputting the mutual attention representation into a multilayer perceptron to obtain a drug effect prediction result; and determining a target loss.

Description

Drug effect prediction model training method, drug effect prediction method and device thereof

Technical Field

The present disclosure relates to the field of computer technology, and in particular, to a drug effect prediction model training method, a drug effect prediction method, an apparatus, a computing device, a computer-readable storage medium, and a computer program product.

Background

Medication planning is critical for patients suffering from serious or life-threatening conditions. Appropriate medication may reduce mortality, shorten hospital stays, and improve health assessment scores. In clinical practice, medication plans are often subject to deviations and errors, particularly when decisions must be made quickly in an intensive care unit for life-threatening situations. Predicting the effectiveness of a medication regimen (e.g., for the first 24 hours of a patient's admission) therefore plays an important role in helping physicians quickly make correct decisions.

However, existing drug effectiveness prediction efforts often focus on a particular drug, a particular disease, or a particular test, and are difficult to extend to general drugs and diseases in the intensive care unit (ICU) setting of a hospital.

Disclosure of Invention

In view of the above, the present disclosure provides a drug effect prediction model training method, a drug effect prediction method, an apparatus, a computing device, a computer-readable storage medium, and a computer program product to alleviate, mitigate, or even eliminate the above-mentioned problems.

According to a first aspect of the present invention, there is provided a training method of a drug effect prediction model for predicting a drug effect, comprising: inputting a dataset into the drug effect prediction model, the dataset comprising a plurality of samples, each sample comprising at least one first entry, at least one second entry, and a real label for the sample; obtaining a knowledge graph associated with the drug in each of the at least one first entry, the knowledge graph comprising a vector representation of at least one entity associated with the drug; inputting the at least one first entry and the vector representation of the at least one entity associated with the drug into a first encoder to obtain a first entry enhancement vector representation; inputting the at least one second entry into a second encoder to obtain a second entry vector representation; calculating a mutual attention representation between the first entry enhancement vector representation and the second entry vector representation; inputting the mutual attention representation into a multilayer perceptron to obtain a drug effect prediction result; and determining a target loss of the drug effect prediction model based on the drug effect prediction result and the real label, iteratively updating parameters of the drug effect prediction model so that the target loss meets a preset condition, and determining the drug effect prediction model with the updated parameters as the trained drug effect prediction model.

In some embodiments, for each sample, each first entry includes at least a drug name and the method of administration of the drug, each second entry includes at least one piece of diagnosed-disease information, and the real label includes at least a probability of death.

In some embodiments, inputting the vector representation of the at least one first entry and the at least one entity associated with the drug into the first encoder to obtain a first entry enhancement vector representation comprises: inputting each first entry of the at least one first entry into a first sub-encoder to obtain a first entry vector representation; stitching a vector representation of at least one entity associated with the drug with the first entry vector representation, resulting in a stitched first entry vector representation; and inputting the at least one stitched first entry vector representation into a second sub-encoder, the second sub-encoder deriving a first entry enhancement vector representation based on the at least one stitched first entry vector representation, wherein the first and second sub-encoders are included in the first encoder, the at least one entity associated with the drug comprising at least one disease entity associated with the drug and other drug entities similar to the drug.

In some embodiments, the vector representation of the at least one entity associated with the drug is obtained at least by: obtaining, from the knowledge graph, a vector representation of at least one disease associated with the drug in each first entry and a vector representation of other drugs similar to the drug, and average-pooling the vector representation of the at least one disease and the vector representation of the other drugs to obtain the vector representation of the at least one entity associated with the drug.

In some embodiments, the target loss of the model comprises a first loss and a second loss, the first loss being a cross entropy function between the drug effect prediction result and the real label, and the second loss being calculated based on the steps of: comparing the drug effect prediction result with the real label, and dividing the samples into a first set and a second set based on whether the drug effect prediction result is consistent with the real label; and performing contrast learning on the first set and the second set, and calculating the second loss based on the contrast learning loss of the contrast learning.
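The partitioning step of the second loss can be illustrated with a short sketch. It is not the disclosed implementation: the function names `split_by_consistency` and `target_loss`, the binary labels, the 0.5 decision threshold, and the weighted sum of the two losses are all assumptions made for illustration, and the contrast learning loss itself is not computed here.

```python
def split_by_consistency(pred_probs, real_labels, threshold=0.5):
    """Split sample indices by whether the thresholded prediction matches the real label.

    Assumes binary labels (0 or 1) and predicted probabilities of class 1.
    Returns (first_set, second_set): consistent and inconsistent sample indices.
    """
    first_set, second_set = [], []
    for idx, (prob, label) in enumerate(zip(pred_probs, real_labels)):
        predicted = 1 if prob >= threshold else 0
        if predicted == label:
            first_set.append(idx)
        else:
            second_set.append(idx)
    return first_set, second_set


def target_loss(first_loss, second_loss, weight=1.0):
    """Combine the two losses; the weighted sum is an assumed combination rule."""
    return first_loss + weight * second_loss


consistent, inconsistent = split_by_consistency([0.9, 0.2, 0.6, 0.4], [1, 0, 0, 1])
print(consistent, inconsistent)
```

The two index lists would then be handed to whatever contrast learning objective is used for the second loss.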

In some embodiments, the drug effect prediction results include at least one of predicting hospitalization mortality, predicting whether a length of hospitalization is greater than a predetermined number of days, and predicting a health score.

In some embodiments, calculating a mutual attention representation between the first entry enhancement vector representation and the second entry vector representation comprises: denoting the first entry enhancement vector representation as $c$ and the second entry vector representation as $x$, and calculating the mutual attention representation between them based on the following formulas, the mutual attention representation comprising an attention representation $\hat{c}$ of the medication based on the diagnosed-disease information and an attention representation $\hat{x}$ of the diagnosed-disease information based on the medication: $\hat{c} = \mathrm{Attention}(c, x, x)$, $\hat{x} = \mathrm{Attention}(x, c, c)$, where $\mathrm{Attention}$ is the attention function.

According to a second aspect of the present invention, there is provided a drug effect prediction method, comprising: inputting a first entry comprising at least a drug name and the method of administration of the drug and a second entry comprising at least one piece of diagnosed-disease information into a drug effect prediction model for predicting a drug effect; and outputting a drug effect prediction result; wherein the drug effect prediction model is trained based on the steps of: inputting a dataset into the drug effect prediction model, the dataset comprising a plurality of samples, each sample comprising at least one first entry, at least one second entry, and a real label for the sample; obtaining a knowledge graph associated with the drug in each of the at least one first entry, the knowledge graph comprising a vector representation of at least one entity associated with the drug; inputting the at least one first entry and the vector representation of the at least one entity associated with the drug into a first encoder to obtain a first entry enhancement vector representation; inputting the at least one second entry into a second encoder to obtain a second entry vector representation; calculating a mutual attention representation between the first entry enhancement vector representation and the second entry vector representation; inputting the mutual attention representation into a multilayer perceptron to obtain a drug effect prediction result; and determining a target loss of the drug effect prediction model based on the drug effect prediction result and the real label, iteratively updating parameters of the drug effect prediction model so that the target loss meets a preset condition, and determining the drug effect prediction model with the updated parameters as the trained drug effect prediction model.

In some embodiments, inputting the vector representation of the at least one first entry and the at least one entity associated with the drug into the first encoder to obtain the first entry enhancement vector representation comprises: inputting each first entry of the at least one first entry into a first sub-encoder to obtain a first entry vector representation; stitching a vector representation of at least one entity associated with the drug with the first entry vector representation, resulting in a stitched first entry vector representation; and inputting the at least one stitched first entry vector representation into a second sub-encoder, the second sub-encoder deriving a first entry enhancement vector representation based on the at least one stitched first entry vector representation, wherein the first and second sub-encoders are included in the first encoder, the at least one entity associated with the drug comprising at least one disease entity associated with the drug and other drug entities similar to the drug.

According to a third aspect of the present invention, there is provided a training device of a drug effect prediction model, comprising: an input module configured to input a dataset into the drug effect prediction model, the dataset comprising a plurality of samples, each sample comprising at least one first entry, at least one second entry, and a true label for the sample; an acquisition module configured to acquire a knowledge-graph associated with a drug in each of the at least one first item, the knowledge-graph comprising a vector representation of at least one entity associated with the drug; a first vector representation module configured to input a vector representation of the at least one first entry and at least one entity associated with the drug into a first encoder to obtain a first entry enhancement vector representation; a second vector representation module configured to input the at least one second entry into a second encoder to obtain a second entry vector representation; a mutual attention module configured to calculate a mutual attention representation between the first item enhancement vector representation and the second item vector representation; an effect prediction module configured to input the mutual attention representation into a multi-layer perceptron to obtain a drug effect prediction result; and the iteration module is configured to determine the target loss of the drug effect prediction model based on the drug effect prediction result and the real label, iteratively update the parameters of the drug effect prediction model to enable the target loss to meet the preset condition, and determine the drug effect prediction model after updating the parameters as the drug effect prediction model.

According to a fourth aspect of the present invention, there is provided a drug effect prediction apparatus comprising: an input module configured to input a first entry including at least a drug name and a method of administering the drug and a second entry including at least one piece of diagnosed-disease information into a drug effect prediction model for predicting a drug effect; and an output module configured to output a drug effect prediction result; wherein the drug effect prediction model is trained based on the steps of: inputting a dataset into the drug effect prediction model, the dataset comprising a plurality of samples, each sample comprising at least one first entry, at least one second entry, and a real label for the sample; obtaining a knowledge graph associated with the drug in each of the at least one first entry, the knowledge graph comprising a vector representation of at least one entity associated with the drug; inputting the at least one first entry and the vector representation of the at least one entity associated with the drug into a first encoder to obtain a first entry enhancement vector representation; inputting the at least one second entry into a second encoder to obtain a second entry vector representation; calculating a mutual attention representation between the first entry enhancement vector representation and the second entry vector representation; inputting the mutual attention representation into a multilayer perceptron to obtain a drug effect prediction result; and determining a target loss of the drug effect prediction model based on the drug effect prediction result and the real label, iteratively updating parameters of the drug effect prediction model so that the target loss meets a preset condition, and determining the drug effect prediction model with the updated parameters as the trained drug effect prediction model.

According to yet another aspect of the present disclosure, there is provided a computing device comprising: a memory configured to store computer-executable instructions; a processor configured to perform any of the methods provided according to the foregoing aspects of the present disclosure when the computer-executable instructions are executed by the processor.

According to yet another aspect of the present disclosure, there is provided a computer-readable storage medium storing computer-executable instructions that, when executed, perform any of the methods provided according to the foregoing aspects of the present disclosure.

The drug effect prediction model training method and the drug effect prediction method provided according to the present disclosure predict drug effectiveness (e.g., drug effectiveness for the first 24 hours) in a hospital or intensive care unit (ICU) based on patient information. Specifically, external knowledge about the drugs involved is integrated into the data-driven prediction by means of a knowledge graph, which improves the reliability of the prediction result. Interactions between the diagnosis and the drug are determined using a mutual attention module. To alleviate the data imbalance problem, a contrast loss is added to the loss function alongside the cross entropy loss, so that the model can adapt to more types of diseases. Extensive experimental results on a public dataset show that the drug effect prediction model training method and the drug effect prediction method provided by the present disclosure are significantly superior to related existing methods. By predicting the effectiveness of medication in hospitals and/or intensive care units, doctors are helped to make more reasonable medication plans and improve treatment, and the cycle of trial and error, observation, and adjustment, together with ineffective treatment, is effectively avoided. Medication becomes more reasonable, adverse drug reactions are avoided, and mortality is reduced.

These and other aspects of the disclosure will be apparent from and elucidated with reference to the embodiments described hereinafter.

Drawings

Further details, features and advantages of the technical solutions of the present disclosure are disclosed in the following description of exemplary embodiments with reference to the attached drawings, in which:

FIG. 1 illustrates an example scenario in which a technical solution provided according to some embodiments of the present disclosure may be applied;

FIG. 2 illustrates a schematic diagram of a knowledge graph associated with a drug;

FIG. 3 illustrates a block diagram of a drug effect prediction model, according to some embodiments of the present disclosure;

FIG. 4 illustrates a flowchart of a training method of a drug effect prediction model, according to some embodiments of the present disclosure;

FIG. 5 illustrates a flowchart of a drug effect prediction method according to some embodiments of the present disclosure;

FIG. 6 illustrates a block diagram of an apparatus of a drug effect prediction model, according to some embodiments of the present disclosure;

FIG. 7 illustrates a block diagram of an apparatus of a drug effect prediction model, according to some embodiments of the present disclosure;

FIG. 8 illustrates an example system including an example computing device that represents at least one system and/or device that can implement the various techniques described herein.

Detailed Description

Several embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings in order to enable those skilled in the art to practice the technical solutions of the present disclosure. The technical solutions of the present disclosure may be embodied in many different forms and objects and should not be limited to the embodiments set forth herein. These embodiments are provided so that this disclosure will be thorough and complete, and should not be construed as limiting the scope of the disclosure.

Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and/or the present specification and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.

Before describing in detail embodiments of the present disclosure, some related concepts will be explained first.

1. BERT (Bidirectional Encoder Representations from Transformers): a bidirectional Transformer encoder. It is a pre-trained language model that produces word or sentence embeddings and can help downstream natural language processing tasks improve performance.

2. APACHE (Acute Physiology and Chronic Health Evaluation): a score based on acute physiology, age, and chronic health evaluation. It is a disease severity grading system used in the intensive care unit (ICU).

3. MLP (Multilayer Perceptron): a multilayer perceptron is a feedforward artificial neural network that maps a set of input vectors to a set of output vectors. The MLP can be seen as a directed graph consisting of multiple layers of nodes, each layer being fully connected to the next. Except for the input nodes, each node is a neuron (or processing unit) with a nonlinear activation function.

In the related art for predicting drug effects, a text-based encoder is used to learn a representation of a single medical record. This approach ignores the correlation between medical records and diagnosis and fails to account for the final drug effect prediction results. In another related art, a fixed effect model is used to model the association between patient vital signs and drug use, and introduce personal information for multiple patients and categories of patients. This approach using a fixed effect model can only predict a specific drug, a specific disease, or a specific test, requires specialized medical knowledge, and is difficult to extend to general drugs and diseases in hospitals and ICU settings.

Fig. 1 schematically illustrates an example scenario 100 in which a technical solution provided according to some embodiments of the present disclosure may be applied. As shown in fig. 1, scenario 100 may include a user 110, a terminal device 120 (e.g., a computer), a terminal device 130 (e.g., a tablet), a network 140, and a remote facility 150. By way of example, the remote facility 150 includes a server 151 and optionally a database device 152 for storing relevant data, which servers or devices may communicate via the network 140.

The service provider may provide a variety of services to the user 110 through the remote facility 150. Taking a SaaS service as an example, the server 151 in the remote facility 150 may act as a cloud server, and the user 110 uses a corresponding program (e.g., a Web portal supported by the SaaS service) deployed on the terminal device 120 to perform related work. In the process, the terminal device 120 communicates with the remote facility 150 via the network 140 to upload and acquire related data.

On the side of the terminal device 120, for predicting the effect of the drug, first, a first entry including at least the name of the drug and the administration method of the drug and a second entry including at least one diagnosis may be acquired.

A first entry comprising at least the drug name and the method of administration of the drug and a second entry comprising at least one diagnosis are input into a drug effect prediction model stored in the remote facility 150, the drug effect prediction model being used to predict a drug effect. Here, the drug effect prediction model is pre-trained. Specifically, a dataset from server 151 is input into the drug effect prediction model, the dataset comprising a plurality of samples, each sample comprising at least a first entry, a second entry, and a real label for that sample; a knowledge graph associated with the drug in each of the at least one first entry is obtained, the knowledge graph comprising a vector representation of at least one entity associated with the drug; the at least one first entry, the vector representation of at least one disease associated with the drug, and the vector representations of other drugs similar to the drug are input into a first encoder to obtain a first entry enhancement vector representation; the second entry is input into a second encoder to obtain a second entry vector representation; a mutual attention representation between the first entry enhancement vector representation and the second entry vector representation is calculated; the mutual attention representation is input into a multilayer perceptron to obtain a drug effect prediction result; and the target loss of the drug effect prediction model is determined based on the drug effect prediction result and the real label, the parameters of the drug effect prediction model are iteratively updated so that the target loss meets a preset condition, and the drug effect prediction model with the updated parameters is determined as the trained drug effect prediction model.

In the present disclosure, the server 151 in the remote facility 150 may be a single server or a server cluster, and the database device 152 in the remote facility 150 may store various data required to identify an abnormality existing in the actual execution of the target service. Illustratively, the user 110 may access the remote facility 150 via the terminal device 120 or the terminal device 130 in a web page. Alternatively, the user may communicate with the remote facility 150 through a client installed on the terminal device 120 or the terminal device 130 to complete the corresponding target service. Alternatively, the server 151 may also run other applications and store other data. For example, the server 151 may include multiple virtual hosts to run different applications and provide different services.

In the present disclosure, the terminal devices 120 and 130 may be various types of devices, such as mobile phones, tablet computers, notebook computers, in-vehicle devices, and the like. The terminal devices 120 and 130 may have disposed thereon a client that may be used to perform task related operations (e.g., initiate tasks, specify tasks), and optionally provide other services, and may take any of the following forms: locally installed applications, applets accessed via other applications, web programs accessed via a browser, etc. User 110 may view information presented by clients and perform corresponding interactions through the input/output interfaces of terminal devices 120 and 130. Alternatively, the terminal devices 120 and 130 may be integrated with the server 151.

In the present disclosure, the database device 152 may be considered as an electronic file cabinet, i.e. a place where electronic files are stored, and a user may perform operations such as adding, querying, updating, deleting, etc. on data in the files. A "database" is a collection of data stored together in a manner that can be shared with multiple objects, with as little redundancy as possible, independent of the application.

Further, in the present disclosure, the network 140 may be a wired network connected via, for example, a cable or an optical fiber, or may be a wireless network such as 2G, 3G, 4G, 5G, Wi-Fi, Bluetooth, ZigBee, or Li-Fi.

It should be noted that, as used herein, the term "user" refers to any party that can exchange data with a service system (e.g., a client system corresponding to a target business) of a service provider, including but not limited to people, program software, network platforms, and even machines.

Fig. 2 schematically shows a knowledge graph 200 associated with a drug. The knowledge graph 200 includes at least one entity associated with the drug. In one example, the at least one entity associated with the drug includes at least one disease entity associated with the drug and other drug entities similar to the drug. Here, the structure of the knowledge graph is fixed, i.e., the nodes and edges in the knowledge graph are fixed. Similar other drug entities refer to other drug entities that are connected to the drug entity by edges in the knowledge graph. Similar drug entities generally have similar compositions and similar functions, with similar therapeutic effects for the same disease. In Fig. 2, the drug is illustrated as nitroglycerin; the associated disease entities are represented by a grid pattern and include, e.g., spasticity and hypertensive encephalopathy, and the associated drug name entities are represented by a diagonal line pattern and include, e.g., nitrogenous compounds, diltiazem, and labetalol. As will be appreciated by those skilled in the art, the disease entities and drug name entities herein are merely examples. In one example, the knowledge graph is from a public dataset. The set of entities in the knowledge graph associated with the drug is denoted $\{\mathrm{drug}_e\}$. The knowledge graph is learned in advance to obtain the embedded representation $e$ of each entity. The drug is denoted $c_i$, and an average pooling operation ($\mathrm{Pooling}$) is used to obtain the knowledge-graph-based representation of the drug, $c_{i,\mathrm{drug}}$, where $c_{i,\mathrm{drug}} = \mathrm{Pooling}(e)$.
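As a minimal sketch of the average pooling over the entities associated with a drug, the following uses the nitroglycerin example of Fig. 2. The two-dimensional embedding values, the dictionary layout, and the helper names `average_pooling` and `drug_kg_representation` are hypothetical and chosen only so the arithmetic is easy to follow; real entity embeddings would be learned from the knowledge graph in advance.

```python
def average_pooling(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    if not vectors:
        raise ValueError("at least one entity embedding is required")
    dim = len(vectors[0])
    return [sum(vec[d] for vec in vectors) / len(vectors) for d in range(dim)]


def drug_kg_representation(drug, neighbours, embeddings):
    """Pool the embeddings of all entities connected to `drug` in the graph."""
    entity_vectors = [embeddings[name] for name in neighbours[drug]]
    return average_pooling(entity_vectors)


# Hypothetical fixed graph structure: drug -> associated disease and drug entities.
neighbours = {
    "nitroglycerin": ["spasticity", "hypertensive encephalopathy",
                      "diltiazem", "labetalol"],
}
# Hypothetical pre-learned entity embeddings (toy 2-dimensional values).
embeddings = {
    "spasticity": [1.0, 0.0],
    "hypertensive encephalopathy": [0.0, 1.0],
    "diltiazem": [1.0, 1.0],
    "labetalol": [2.0, 2.0],
}

c_drug = drug_kg_representation("nitroglycerin", neighbours, embeddings)
print(c_drug)  # [1.0, 1.0]
```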

Fig. 3 schematically illustrates a block diagram of a drug effect prediction model 300 according to some embodiments of the present disclosure. The drug effect prediction model 300 includes a structure for performing knowledge-graph external information integration, drug treatment record representation, drug and diagnosis combination, and prediction and model optimization of drug effects. These structures are described in further detail below:

First, medication records are input. A medication record includes entries such as the medication name, the dose administered, and the infusion rate. One or more medication records are input here, each including $k$ words. Taking the $i$-th medication record as an example, its $k$ words are input into word encoder 301, which is used to obtain the word representation $c_{i,\mathrm{token}}$ of the medication record; that is, $c_{i,\mathrm{token}}$ is the output of $\mathrm{TokenEncoder}$ applied to the $k$ words of the record, where $\mathrm{TokenEncoder}$ is word encoder 301. Word encoder 301 may employ any sequential encoder, such as BERT. Also shown schematically in Fig. 3 is the $(i+1)$-th medication record; similarly, the $k$ words contained in the $(i+1)$-th medication record are input into the word encoder.

The next step is to integrate external information from the knowledge graph. For the drug name contained in the $i$-th medication record, the knowledge graph associated with that drug name is acquired, and the set of entities associated with the drug in the knowledge graph is denoted $\{\mathrm{drug}_e\}$. The knowledge graph is learned in advance to obtain the embedded representation $e$ of each entity. The drug is denoted $c_i$, and an average pooling operation ($\mathrm{Pooling}$) is used to obtain the knowledge-graph-based representation of the drug, $c_{i,\mathrm{drug}}$, where $c_{i,\mathrm{drug}} = \mathrm{Pooling}(e)$. The word representation of the medication record is concatenated with the knowledge-graph-based external knowledge representation, so that $c_i = c_{i,\mathrm{token}} \oplus c_{i,\mathrm{drug}}$, where $\oplus$ is the concatenation operation.

There are often multiple medication records for the same patient, such as the $i$-th record and the $(i+1)$-th record shown in Fig. 3. The above operation is performed for each medication record, and the concatenated medication record vector representations $c_1, c_2, \dots, c_N$ are input into sequence encoder 302 to learn an overall vector representation of the $N$ medication records of one hospitalization: $c = \mathrm{SeqEncoder}(c_1, c_2, \dots, c_N)$, where $N$ is a natural number and $\mathrm{SeqEncoder}$ is sequence encoder 302. The sequence encoder 302 is used here to characterize the temporal relationship between medication records; $\mathrm{SeqEncoder}$ may also be any sequential encoder, such as BERT.
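The per-record concatenation that produces the input of the sequence encoder can be sketched as follows. The record layout (`"token"` and `"drug"` keys), the toy vector values, and the function names are assumptions for illustration; the word representations and knowledge-graph representations would in practice come from the word encoder and from pooling the entity embeddings, and the sequence encoder itself is not modelled here.

```python
def concatenate(token_repr, drug_repr):
    """Join the word representation and the knowledge-graph representation."""
    return list(token_repr) + list(drug_repr)


def build_sequence_input(records):
    """Return one concatenated vector per medication record, in record order."""
    return [concatenate(rec["token"], rec["drug"]) for rec in records]


# Two hypothetical medication records of one hospitalization (N = 2).
records = [
    {"token": [0.1, 0.2, 0.3], "drug": [1.0, 1.0]},
    {"token": [0.4, 0.5, 0.6], "drug": [0.0, 2.0]},
]

sequence = build_sequence_input(records)
for vec in sequence:
    print(len(vec), vec)
```

Keeping the records in their original order matters because the sequence encoder is described as characterizing the temporal relationship between records.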

Next, the medication records are combined with the diagnoses. Typically, when making medication recommendations, medical personnel consider the patient's actual condition and prescribe according to the symptoms. Existing diagnostic information should therefore also be fused when predicting drug effects here. First, the diagnosis encoder 303 is used to encode the $L$ diagnoses $x_1, \dots, x_L$ of the current hospitalization ($L$ being a positive integer): $x = \mathrm{DXEncoder}([x_1, \dots, x_L])$, where $\mathrm{DXEncoder}$ is the diagnosis encoder 303. Like the word encoder $\mathrm{TokenEncoder}$ 301 and the sequence encoder $\mathrm{SeqEncoder}$ 302, the diagnosis encoder $\mathrm{DXEncoder}$ 303 may use any encoder, such as BERT.

The mutual attention representation between the medication records $c = SeqEncoder(c_1, c_2, \dots, c_N)$ and the diagnoses $x = DXEncoder([x_1, \dots, x_L])$ is then calculated, and the mutual attention representation is input into the multi-layer perceptron to obtain the drug effect prediction result. The relationship between diagnoses and drugs is characterized using an attention mechanism, which can be represented by the following formula:

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{QK^{\top}}{\sqrt{d}}\right)V \qquad (1)$$

where $Q$ denotes the query, $K$ the key, $V$ the value, and $d$ the embedding dimension. The embedding dimension $d$ is usually an empirical value; in one example, $d$ is 768. As will be appreciated by those skilled in the art, $d$ may also take any other suitable value. Substituting the representation of the medication records and the representation of the diagnoses into formula (1) yields, respectively, a diagnosis-based attention representation of the medication records and a medication-based attention representation of the diagnoses:

（2）

（3）
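The attention step can be sketched in plain Python as follows, assuming the attention mechanism is standard scaled dot-product attention. Which representation serves as query and which as key and value in formulas (2) and (3) is not stated in the surrounding text, so the assignment below (each side queries the other) is an assumption, and the toy vectors are hypothetical.

```python
import math
from typing import List

Matrix = List[List[float]]


def softmax(row: List[float]) -> List[float]:
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(q: Matrix, k: Matrix, v: Matrix) -> Matrix:
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = len(q[0])
    out = []
    for q_row in q:
        scores = [
            sum(a * b for a, b in zip(q_row, k_row)) / math.sqrt(d)
            for k_row in k
        ]
        weights = softmax(scores)
        out.append([
            sum(w * v_row[j] for w, v_row in zip(weights, v))
            for j in range(len(v[0]))
        ])
    return out


# Hypothetical toy inputs: N = 3 medication vectors and L = 2 diagnosis vectors.
c = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
x = [[1.0, 0.0], [0.0, 1.0]]

# Assumed direction: medication vectors query the diagnoses, and vice versa.
c_att = attention(c, x, x)
x_att = attention(x, c, c)
```

Each output row is a weighted average of the value rows, so `c_att` has one row per medication vector and `x_att` one row per diagnosis vector.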

Finally, the drug effect is predicted and the model is optimized. The medication representation and the diagnosis representation computed by the attention module are each average-pooled. The two pooled vectors are spliced, and the spliced vector is input into a multi-layer perceptron (MLP) to obtain the drug effect prediction label.
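The pooling, splicing, and MLP steps can be sketched as below. This is an illustrative sketch only: the single hidden layer, the ReLU activation, the softmax output, and all weights are assumptions, since the text does not specify the architecture of the multi-layer perceptron.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def mean_pool(rows: Matrix) -> Vector:
    """Average the rows of a matrix element-wise."""
    return [sum(r[j] for r in rows) / len(rows) for j in range(len(rows[0]))]


def linear(x: Vector, weights: Matrix, bias: Vector) -> Vector:
    """Affine layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def softmax(x: Vector) -> Vector:
    m = max(x)
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


def predict(c_att: Matrix, x_att: Matrix,
            w1: Matrix, b1: Vector, w2: Matrix, b2: Vector) -> Vector:
    """Pool both attention outputs, splice them, and apply a two-layer MLP."""
    h = mean_pool(c_att) + mean_pool(x_att)          # spliced vector
    hidden = [max(0.0, v) for v in linear(h, w1, b1)]  # ReLU (assumed)
    return softmax(linear(hidden, w2, b2))           # class probabilities


# Hypothetical toy attention outputs and weights.
c_att = [[1.0, 0.0], [0.0, 1.0]]
x_att = [[1.0, 1.0]]
w1 = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
b1 = [0.0, 0.0]
w2 = [[1.0, 0.0], [0.0, 1.0]]
b2 = [0.0, 0.0]
probs = predict(c_att, x_att, w1, b1, w2, b2)
predicted_label = probs.index(max(probs))
```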

The loss function combines a cross-entropy function and a contrastive loss to optimize the classification performance of the model. The cross-entropy function may be defined as follows:

（4）

Formula (4) is expressed in terms of the size of a training batch and the true label of each data sample in that batch.
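A minimal sketch of a batch cross-entropy loss follows, assuming the standard categorical form averaged over the batch; the exact form of formula (4) is not reproduced here, and the probabilities and labels are hypothetical.

```python
import math
from typing import List


def cross_entropy(probs: List[List[float]], labels: List[int]) -> float:
    """Mean negative log-probability of the true class over one batch."""
    eps = 1e-12  # guards against log(0)
    return -sum(math.log(p[y] + eps) for p, y in zip(probs, labels)) / len(labels)


# Hypothetical batch of two samples with predicted class probabilities.
batch_probs = [[0.9, 0.1], [0.2, 0.8]]
batch_labels = [0, 1]
loss = cross_entropy(batch_probs, batch_labels)
```

The loss approaches zero as the probability assigned to each true label approaches one.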

Data imbalance is a known problem in the prior art: the amounts of data in the dataset carrying different labels differ greatly, for example the survival and death labels for mortality (e.g., the number of surviving patients is 1 and the number of deceased patients is 10). To make efficient use of the dataset and eliminate the problem of data imbalance, a contrastive learning loss is used to learn sample features better. The purpose of contrastive learning is to pull together the features of samples of the same class while separating the features of samples of different classes. Within one training batch, consider a given sample: the other samples of the batch can be divided into two sets, one containing the samples whose label is consistent with that of the given sample and the other containing the samples whose label is inconsistent with it. The contrastive learning loss may then be defined as follows:

（5）

Formula (5) contains a temperature hyperparameter that is chosen empirically; a suitable temperature can speed up the convergence of the model. Here the temperature is set to 0.07. As will be appreciated by those skilled in the art, the temperature may also take any other value.
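The idea of pulling together same-label samples and separating different-label samples can be sketched with a supervised contrastive loss. The formulation below (cosine similarity, log-softmax over all other samples in the batch, averaged over same-label samples) is a commonly used form and is an assumption; it is not claimed to be identical to formula (5). The function name and toy features are hypothetical; the temperature default of 0.07 is the value given in the text.

```python
import math
from typing import List


def normalize(v: List[float]) -> List[float]:
    n = math.sqrt(sum(a * a for a in v))
    return [a / n for a in v]


def sup_con_loss(features: List[List[float]], labels: List[int],
                 tau: float = 0.07) -> float:
    """Supervised contrastive loss over one batch (assumed formulation)."""
    z = [normalize(f) for f in features]
    total, anchors = 0.0, 0
    for i in range(len(z)):
        # Samples in the batch whose label is consistent with sample i.
        positives = [p for p in range(len(z)) if p != i and labels[p] == labels[i]]
        if not positives:
            continue
        # Temperature-scaled cosine similarity to every other sample.
        sims = {a: sum(u * w for u, w in zip(z[i], z[a])) / tau
                for a in range(len(z)) if a != i}
        m = max(sims.values())
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims.values()))
        total += -sum(sims[p] - log_denom for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


# Hypothetical features: two tight clusters in 2-D.
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
loss_clustered = sup_con_loss(features, [0, 0, 1, 1])  # labels match clusters
loss_mixed = sup_con_loss(features, [0, 1, 0, 1])      # labels cut across clusters
```

When the labels agree with the feature clusters the loss is small; when same-label samples lie far apart the loss is large, which is the signal that drives feature learning.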

The final loss function of the model is defined as follows:

（6）

Formula (6) contains a balancing parameter that weights the two losses. The balancing parameter depends on the training round: it is determined from the current training round and the total number of training rounds. The purpose is to let the model first learn better features and then focus on optimizing the classifier.
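One way such a round-dependent weighting could work is sketched below. The linear schedule (current round divided by total rounds) and the convex combination of the two losses are assumptions made for illustration; they match the stated purpose (features first, classifier later) but are not claimed to be the exact definition in formula (6). The function names are hypothetical.

```python
def balance(current_round: int, total_rounds: int) -> float:
    """Assumed linear schedule: 0 at the start of training, 1 at the end."""
    if total_rounds <= 0:
        raise ValueError("total_rounds must be positive")
    return current_round / total_rounds


def total_loss(ce_loss: float, contrastive_loss: float,
               current_round: int, total_rounds: int) -> float:
    """Assumed combination: early rounds emphasise the contrastive loss
    (feature learning), later rounds emphasise cross-entropy (classifier)."""
    lam = balance(current_round, total_rounds)
    return lam * ce_loss + (1.0 - lam) * contrastive_loss


# Hypothetical loss values at the start, middle, and end of 10 rounds.
start = total_loss(2.0, 4.0, 0, 10)
middle = total_loss(2.0, 4.0, 5, 10)
end = total_loss(2.0, 4.0, 10, 10)
```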

Fig. 4 schematically illustrates a flowchart of a training method 400 of a drug effect prediction model according to some embodiments of the present disclosure.

In step 401, a dataset is input into the drug effect prediction model, the dataset comprising a plurality of samples, each sample comprising at least one first entry, at least one second entry, and a true label for the sample. Specifically, the first entry is a medication record. A medication record includes items such as drug names, administered doses, and infusion rates. One or more medication records are input, and each medication record includes k words, for example the k words of the $i$-th medication record. The second entry is the $L$ diagnoses $x_1, \dots, x_L$ of this hospitalization ($L$ being a positive integer).

In step 402, a knowledge graph associated with the drug in each of the at least one first entry is obtained, the knowledge graph comprising a vector representation of at least one entity associated with the drug. The at least one entity associated with the drug includes at least one disease entity associated with the drug and other drug entities similar to the drug. Taking nitroglycerin as an example, the associated disease entities, shown with a grid pattern, include spasticity and hypertensive encephalopathy, and the associated drug name entities, shown with a diagonal-line pattern, include nitrogenous compounds, diltiazem, and labetalol. As will be appreciated by those skilled in the art, these disease entities and drug name entities are merely examples. In one example, the knowledge graph comes from a public dataset. The set of entities in the knowledge graph associated with the drug is denoted $\{drug_e\}$. The knowledge graph is learned in advance to obtain an embedded representation $e$ of each entity.
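The entity lookup in this step can be illustrated with a toy in-memory knowledge graph. The entity names are the nitroglycerin examples from the text; the dictionary layout and the function name `related_entities` are hypothetical, and a real system would query a knowledge graph store and return pre-trained embeddings rather than names.

```python
from typing import Dict, List

# Toy knowledge graph keyed by lower-case drug name (hypothetical layout).
KNOWLEDGE_GRAPH: Dict[str, Dict[str, List[str]]] = {
    "nitroglycerin": {
        "disease": ["spasticity", "hypertensive encephalopathy"],
        "drug": ["nitrogenous compounds", "diltiazem", "labetalol"],
    }
}


def related_entities(drug_name: str) -> List[str]:
    """Return the entity set associated with a drug, or an empty list
    if the drug is not in the knowledge graph."""
    entry = KNOWLEDGE_GRAPH.get(drug_name.lower(), {})
    return entry.get("disease", []) + entry.get("drug", [])


entities = related_entities("Nitroglycerin")
```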

In step 403, the at least one first entry and the vector representation of the at least one entity associated with the drug are input to a first encoder to obtain a first entry enhancement vector representation. Specifically, the words of the medication record are first input into the word encoder $TokenEncoder$, which produces the word representation $c_{i,token}$ of the medication record. The word encoder may be any sequence encoder, such as BERT. For the $i$-th medication record, the drug names contained in the record are extracted, the knowledge graph associated with those drug names is acquired, and the set of entities in the knowledge graph associated with the drug is obtained and denoted $\{drug_e\}$. The knowledge graph is learned in advance to obtain an embedded representation $e$ of each entity. An average pooling operation $Pooling$ is applied to these embeddings to obtain the knowledge-graph-based representation of the drug, $c_{i,drug} = Pooling(e)$. The word representation of the medication record is spliced with this knowledge-graph-based external knowledge representation to obtain the first entry enhancement vector representation $c_i = c_{i,token} \oplus c_{i,drug}$, where $\oplus$ is the splicing (concatenation) operation.

In step 404, the at least one second entry is input to a second encoder to obtain the second entry vector representation. The $L$ diagnoses $x_1, \dots, x_L$ of this hospitalization ($L$ being a positive integer) are encoded, giving the second entry vector representation $x = DXEncoder([x_1, \dots, x_L])$, where $DXEncoder$ is the diagnosis encoder. Like the word encoder and the sequence encoder, the diagnosis encoder may be any encoder, such as BERT.

In step 405, a mutual attention representation between the first entry enhancement vector representation and the second entry vector representation is calculated. First, the diagnosis encoder 303 encodes the $L$ diagnoses $x_1, \dots, x_L$ of this hospitalization ($L$ being a positive integer): $x = DXEncoder([x_1, \dots, x_L])$, where $DXEncoder$ is the diagnosis encoder 303. Like the word encoder $TokenEncoder$ 301 and the sequence encoder $SeqEncoder$ 302, the diagnosis encoder $DXEncoder$ 303 may be any encoder, such as BERT.

The mutual attention between the medication records $c = SeqEncoder(c_1, c_2, \dots, c_N)$ and the diagnoses $x = DXEncoder([x_1, \dots, x_L])$ is then calculated. The relationship between diagnoses and drugs is characterized using an attention mechanism, which can be represented by the following formula:

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{QK^{\top}}{\sqrt{d}}\right)V \qquad (1)$$

where $Q$ denotes the query, $K$ the key, $V$ the value, and $d$ the embedding dimension. The embedding dimension $d$ is usually an empirical value; in one example, $d$ is 768. As will be appreciated by those skilled in the art, $d$ may also take any other suitable value. Substituting the representation of the medication records and the representation of the diagnoses into formula (1) yields, respectively, a diagnosis-based attention representation of the medication records and a medication-based attention representation of the diagnoses:

（2）

（3）

In step 406, the mutual attention representation is input to the multi-layer perceptron to obtain the drug effect prediction result. The medication representation and the diagnosis representation computed by the attention module are each average-pooled. The two pooled vectors are spliced, and the spliced vector is input into a multi-layer perceptron (MLP) to obtain the drug effect prediction label.

In step 407, a target loss of the drug effect prediction model is determined based on the drug effect prediction result and the true label, the parameters of the drug effect prediction model are iteratively updated until the target loss meets a preset condition, and the model with the updated parameters is determined as the trained drug effect prediction model. The target loss of the model includes a first loss and a second loss. The first loss is a cross-entropy function between the drug effect prediction result and the true label. The cross-entropy function may be defined as follows:

（4）

Data imbalance is a known problem in the prior art: the amounts of data in the dataset carrying different labels differ greatly, for example the survival and death labels for mortality (e.g., the number of surviving patients is 1 and the number of deceased patients is 10). To make efficient use of the dataset and eliminate the problem of data imbalance, a second loss, the contrastive learning loss, is used to learn sample features better. The second loss is calculated as follows: the drug effect prediction result is compared with the true label, and the samples are divided into a first set and a second set based on whether the drug effect prediction result is consistent with the true label; contrastive learning is then performed on the first set and the second set, and the second loss is calculated from the contrastive learning loss. Within one training batch, consider a given sample: the other samples of the batch can be divided into two sets, one containing the samples whose label is consistent with that of the given sample and the other containing the samples whose label is inconsistent with it. The contrastive learning loss may then be defined as follows:

（5）

Formula (5) contains a temperature hyperparameter that is chosen empirically; a suitable temperature can speed up the convergence of the model. Here the temperature is set to 0.07. The final loss function of the model is defined as follows:

（6）

Fig. 5 schematically illustrates a flowchart of a drug effect prediction method 500 according to some embodiments of the present disclosure. In step 501, a first entry including at least the drug name and the method of administration of the drug and a second entry including at least one diagnosis are input into a drug effect prediction model for predicting a drug effect. Next, in step 502, a drug effect prediction result is output. Specifically, the first entry is a medication record. A medication record includes items such as drug names, administered doses, and infusion rates. One or more medication records are input, and each medication record includes k words, for example the k words of the $i$-th medication record. The second entry is the $L$ diagnoses $x_1, \dots, x_L$ of this hospitalization ($L$ being a positive integer).

Fig. 6 schematically illustrates an example block diagram of a training apparatus 600 of a drug effect prediction model according to some embodiments of the disclosure. Illustratively, the training apparatus 600 of the drug effect prediction model may be deployed on the terminal device 120 or 130 shown in fig. 1. As shown in fig. 6, a training apparatus 600 of a drug effect prediction model includes an input module 601, an acquisition module 602, a first vector representation module 603, a second vector representation module 604, a mutual attention module 605, an effect prediction module 606, and an iteration module 607.

In particular, the input module 601 may be configured to input a dataset into the drug effect prediction model, the dataset comprising a plurality of samples, each sample comprising at least one first entry, at least one second entry, and a true label for the sample; the acquisition module 602 may be configured to acquire a knowledge-graph associated with the drug in each of the at least one first entry, the knowledge-graph comprising a vector representation of at least one entity associated with the drug; the first vector representation module 603 is configured to input a vector representation of the at least one first entry and at least one entity associated with the drug into the first encoder to obtain a first entry enhancement vector representation; the second vector representation module 604 may be configured to input the at least one second entry into a second encoder to obtain a second entry vector representation; the mutual attention module 605 may be configured to calculate a mutual attention representation between the first item enhancement vector representation and the second item vector representation; the effect prediction module 606 may input the mutual attention representation into a multi-layer perceptron to obtain a drug effect prediction result; and the iteration module 607 may determine a target loss of the drug effect prediction model based on the drug effect prediction result and the real label, iteratively update parameters of the drug effect prediction model to make the target loss meet a preset condition, and determine the drug effect prediction model after updating the parameters as the drug effect prediction model.

It should be appreciated that the training apparatus 600 of the drug effect prediction model may be implemented in software, hardware, or a combination of software and hardware, that a plurality of different modules in the apparatus may be implemented in the same software or hardware structure, or that one module may be implemented by a plurality of different software or hardware structures.

In addition, the training apparatus 600 of the drug effect prediction model may be used to implement the training method 400 of the drug effect prediction model described above; the relevant details have been described above and, for brevity, are not repeated here. These apparatuses may have the same features and advantages as described for the corresponding methods.

Fig. 7 schematically illustrates a block diagram of a drug effect prediction apparatus 700 according to some embodiments of the present disclosure. Illustratively, the drug effect prediction apparatus 700 may be deployed on the terminal device 120 or 130 shown in fig. 1. As shown in fig. 7, the drug effect prediction apparatus 700 includes an input module 701 and an output module 702.

In particular, the input module 701 may be configured to input a first entry including at least the drug name and the method of administration of the drug and a second entry including at least one piece of diagnosed disease information into a drug effect prediction model for predicting a drug effect; the output module 702 may be configured to output a drug effect prediction result. The drug effect prediction model is trained based on the following steps: inputting a dataset into the drug effect prediction model, the dataset comprising a plurality of samples, each sample comprising at least one first entry, at least one second entry, and a true label for the sample; obtaining a knowledge graph associated with the drug in each of the at least one first entry, the knowledge graph comprising a vector representation of at least one entity associated with the drug; inputting the at least one first entry and the vector representation of the at least one entity associated with the drug into a first encoder to obtain a first entry enhancement vector representation; inputting the at least one second entry into a second encoder to obtain a second entry vector representation; calculating a mutual attention representation between the first entry enhancement vector representation and the second entry vector representation; inputting the mutual attention representation into a multi-layer perceptron to obtain a drug effect prediction result; and determining a target loss of the drug effect prediction model based on the drug effect prediction result and the true label, iteratively updating the parameters of the drug effect prediction model until the target loss meets a preset condition, and determining the model with the updated parameters as the trained drug effect prediction model.

It should be appreciated that the drug effect prediction apparatus 700 may be implemented in software, hardware, or a combination of software and hardware, that a plurality of different modules in the apparatus may be implemented in the same software or hardware structure, or that a module may be implemented by a plurality of different software or hardware structures.

In addition, the drug effect prediction apparatus 700 may be used to implement the drug effect prediction method 500 described above; the relevant details have been described above and, for brevity, are not repeated here. These apparatuses may have the same features and advantages as described for the corresponding methods.

Fig. 8 illustrates an example system including an example computing device 800 that represents at least one system and/or device in which the various techniques described herein may be implemented. Computing device 800 may be, for example, a server used by a node in a blockchain, a device associated with a server, a system-on-chip, and/or any other suitable computing device or computing system. The training apparatus 600 of the drug effect prediction model and the drug effect prediction apparatus 700 described above with reference to Figs. 6 and 7 may take the form of the computing device 800. Alternatively, the training apparatus 600 of the drug effect prediction model and the drug effect prediction apparatus 700 may be implemented as computer programs in the form of the application 816.

The example computing device 800 as illustrated in fig. 8 includes a processing system 811, at least one computer-readable medium 812, and at least one I/O interface 813 communicatively coupled to each other. Although not shown, computing device 800 may also include a system bus or other data and command transfer system that couples the various components to one another. A system bus may include any one or combination of different bus structures, such as a memory bus or memory controller, a peripheral bus, a universal serial bus, and/or a processor or local bus that utilizes any of a variety of bus architectures. Various other examples are also contemplated, such as control and data lines.

The processing system 811 is representative of functionality that performs at least one operation using hardware. Thus, the processing system 811 is illustrated as including hardware elements 814 that may be configured as processors, functional blocks, and the like. This may include implementation in hardware as an application specific integrated circuit or other logic device formed using at least one semiconductor. The hardware element 814 is not limited by the materials from which it is formed or the processing mechanisms employed therein. For example, the processor may be comprised of semiconductor(s) and/or transistors (e.g., electronic Integrated Circuits (ICs)). In such a context, the processor-executable instructions may be electronically-executable instructions.

Computer-readable medium 812 is illustrated as including memory/storage 815. Memory/storage 815 represents memory/storage capacity associated with at least one computer-readable medium. Memory/storage 815 may include volatile media (such as Random Access Memory (RAM)) and/or nonvolatile media (such as Read Only Memory (ROM), flash memory, optical disks, magnetic disks, and so forth). Memory/storage 815 may include fixed media (e.g., RAM, ROM, a fixed hard drive, etc.) and removable media (e.g., flash memory, a removable hard drive, an optical disk, etc.). The computer-readable medium 812 may be configured in a variety of other ways as described further below.

At least one I/O interface 813 represents functionality that allows a user to input commands and information to computing device 800 using various input devices, and optionally also allows information to be presented to the user and/or other components or devices using various output devices. Examples of input devices include keyboards, cursor control devices (e.g., mice), microphones (e.g., for voice input), scanners, touch functions (e.g., capacitive or other sensors configured to detect physical touches), cameras (e.g., motion that does not involve touches may be detected as gestures using visible or invisible wavelengths such as infrared frequencies), and so forth. Examples of output devices include a display device (e.g., projector), speakers, printer, network card, haptic response device, and so forth. Accordingly, computing device 800 may be configured in a variety of ways to support user interaction as described further below.

Computing device 800 also includes applications 816. The application 816 may be, for example, a software instance of the training apparatus 600 and the drug effect prediction apparatus 700 of the drug effect prediction model, and implement the techniques described herein in combination with other elements in the computing device 800.

To verify the effectiveness of the model, three classification tasks are defined to measure the effect of a drug: predicting in-hospital mortality, predicting whether the length of hospital stay exceeds a predetermined number of days, and predicting the APACHE score. The experimental results are shown in Table 1. The table shows that the proposed model outperforms all baseline methods in both micro F1 and macro F1. The prediction results comprise: in-hospital mortality (mortalities), length of hospital stay (LOS 7), and APACHE.
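For reference, micro F1 and macro F1 can be computed from true and predicted labels as sketched below. The function name `f1_scores` and the toy labels are hypothetical and are not results from the experiments; micro F1 pools the counts over all classes, whereas macro F1 averages the per-class F1 scores and is therefore more sensitive to minority classes.

```python
from typing import List, Tuple


def f1_scores(y_true: List[int], y_pred: List[int]) -> Tuple[float, float]:
    """Return (micro F1, macro F1) for single-label classification."""
    classes = sorted(set(y_true) | set(y_pred))
    tp_sum = fp_sum = fn_sum = 0
    per_class = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        per_class.append(2 * tp / denom if denom else 0.0)
        tp_sum, fp_sum, fn_sum = tp_sum + tp, fp_sum + fp, fn_sum + fn
    micro_denom = 2 * tp_sum + fp_sum + fn_sum
    micro = 2 * tp_sum / micro_denom if micro_denom else 0.0
    macro = sum(per_class) / len(per_class) if per_class else 0.0
    return micro, macro


# Hypothetical imbalanced toy labels (three samples of class 0, one of class 1).
y_true = [0, 0, 0, 1]
y_pred = [0, 0, 1, 1]
micro_f1, macro_f1 = f1_scores(y_true, y_pred)
```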

Table 1 Experimental results of the predictive task

Therefore, the drug effect prediction model training method and the drug effect prediction method provided by the present disclosure are clearly superior to the related existing methods. Predicting the effectiveness of drug treatment for hospital intensive care units and/or severe cases helps doctors make more reasonable medication plans and improves treatment, effectively avoiding the trial-and-error cycle of trying, observing, and adjusting, as well as ineffective treatment. With reasonable dosing, adverse drug reactions are avoided and mortality is reduced.

Various techniques may be described herein in the general context of software, hardware elements, or program modules. Generally, these modules include routines, programs, elements, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The terms "module," "functionality," and "component" as used herein generally represent software, firmware, hardware, or a combination thereof. The features of the techniques described herein are platform-independent, meaning that the techniques may be implemented on a variety of computing platforms having a variety of processors.

An implementation of the described modules and techniques may be stored on or transmitted across some form of computer readable media. Computer readable media can include a variety of media that are accessible by computing device 800. By way of example, and not limitation, computer readable media may comprise "computer readable storage media" and "computer readable signal media".

"computer-readable storage medium" refers to a medium and/or device that can permanently store information and/or a tangible storage device, as opposed to a mere signal transmission, carrier wave, or signal itself. Thus, computer-readable storage media refers to non-signal bearing media. Computer-readable storage media include hardware such as volatile and nonvolatile, removable and non-removable media and/or storage devices implemented in methods or techniques suitable for storage of information such as computer-readable instructions, data structures, program modules, logic elements/circuits or other data. Examples of a computer-readable storage medium may include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital Versatile Disks (DVD) or other optical storage, hard disk, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or other storage devices, tangible media, or articles of manufacture adapted to store the desired information and which may be accessed by a computer.

"computer-readable signal medium" refers to a signal bearing medium configured to transmit instructions to hardware of computing device 800, such as via a network. Signal media may typically be embodied in a modulated data signal, such as a carrier wave, data signal, or other transport mechanism, with computer readable instructions, data structures, program modules, or other data. Signal media also include any information delivery media. The term "modulated data signal" means a signal that has at least one of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media includes wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared and other wireless media.

As previously described, the hardware elements 814 and computer-readable media 812 represent instructions, modules, programmable device logic, and/or fixed device logic implemented in hardware that, in some embodiments, may be used to implement at least some aspects of the techniques described herein. The hardware elements may include integrated circuits or components of a system on a chip, Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs), Complex Programmable Logic Devices (CPLDs), and other implementations in silicon or other hardware devices. In this context, the hardware elements may be implemented as processing devices that perform program tasks defined by instructions, modules, and/or logic embodied by the hardware elements, as well as hardware devices that store instructions for execution, such as the previously described computer-readable storage media.

Combinations of the foregoing may also be used to implement the various techniques and modules described herein. Accordingly, software, hardware, or program modules, and other program modules may be implemented as at least one instruction and/or logic embodied on some form of computer readable storage medium and/or by at least one hardware element 814. Computing device 800 may be configured to implement particular instructions and/or functions corresponding to software and/or hardware modules. Thus, for example, by using the computer-readable storage medium of the processing system and/or the hardware element 814, a module may be implemented at least in part in hardware as a module executable by the computing device 800 as software. The instructions and/or functions may be executable/operable by at least one article of manufacture (e.g., at least one computing device 800 and/or processing system 811) to implement the techniques, modules, and examples described herein.

In various implementations, computing device 800 may take on a variety of different configurations. For example, computing device 800 may be implemented as a computer-like device including a personal computer, desktop computer, multi-screen computer, laptop computer, netbook, and the like. Computing device 800 may also be implemented as a mobile appliance-like device that includes mobile devices such as mobile telephones, portable music players, portable gaming devices, tablet computers, multi-screen computers, and the like. Computing device 800 may also be implemented as a television-like device that includes devices having or connected to generally larger screens in casual viewing environments. Such devices include televisions, set-top boxes, gaming machines, and the like.

The techniques described herein may be supported by these various configurations of computing device 800 and are not limited to the specific examples of techniques described herein. The functionality may also be implemented in whole or in part on the "cloud" 820 through the use of a distributed system, such as through platform 822 as described below.

Cloud 820 includes and/or is representative of platform 822 for resource 824. Platform 822 abstracts underlying functionality of hardware (e.g., servers) and software resources of cloud 820. The resources 824 may include applications and/or data that can be used when executing computer processing on servers remote from the computing device 800. The resources 824 may also include services provided over the internet and/or over subscriber networks such as cellular or Wi-Fi networks.

Platform 822 may abstract resources and functionality to connect computing device 800 with other computing devices. Platform 822 may also be used to abstract a hierarchy of resources to provide a corresponding level of hierarchy of requirements encountered for resources 824 implemented via platform 822. Thus, in an interconnected device embodiment, the implementation of the functionality described herein may be distributed throughout the system 800. For example, the functionality may be implemented in part on computing device 800 and by platform 822 abstracting the functionality of cloud 820.

It should be understood that for clarity, embodiments of the present disclosure have been described with reference to different functional units. However, it will be apparent that the functionality of each functional unit may be implemented in a single unit, in a plurality of units or as part of other functional units without departing from the present disclosure. For example, functionality illustrated to be performed by a single unit may be performed by multiple different units. Thus, references to specific functional units are only to be seen as references to suitable units for providing the described functionality rather than indicative of a strict logical or physical structure or organization. Thus, the present disclosure may be implemented in a single unit or may be physically and functionally distributed between different units and circuits.

It will be understood that, although the terms first, second, third, etc. may be used herein to describe various devices, elements, components or sections, these devices, elements, components or sections should not be limited by these terms. These terms are only used to distinguish one device, element, component, or section from another device, element, component, or section.

Although the present disclosure has been described in connection with some embodiments, it is not intended to be limited to the specific form set forth herein. Rather, the scope of the present disclosure is limited only by the appended claims. Additionally, although individual features may be included in different claims, these may possibly be advantageously combined, and the inclusion in different claims does not imply that a combination of features is not feasible and/or advantageous. The order of features in the claims does not imply any specific order in which the features must be worked. Furthermore, in the claims, the word "comprising" does not exclude other elements, and the term "a" or "an" does not exclude a plurality. Reference signs in the claims are provided merely as a clarifying example and shall not be construed as limiting the scope of the claims in any way.

The present disclosure provides a computer-readable storage medium having stored thereon computer-readable instructions that, when executed, implement the training method of the drug effect prediction model and the drug effect prediction method described above.

The present disclosure provides a computer program product or computer program comprising computer instructions stored in a computer-readable storage medium. A processor of a computing device reads the computer instructions from the computer-readable storage medium and executes them, causing the computing device to perform the training method of the drug effect prediction model and the drug effect prediction method provided in the various optional implementations described above.

Variations to the disclosed embodiments can be understood and effected by those skilled in the art in practicing the claimed subject matter, from a study of the drawings, the disclosure, and the appended claims. In the claims, the word "comprising" does not exclude other elements or steps, and the indefinite article "a" or "an" does not exclude a plurality. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage.

It will be appreciated that particular embodiments of the present disclosure involve data relating to the actual execution of a target service. When the above embodiments of the present disclosure are applied to a specific product or technology, user approval or consent is required, and the collection, use, and processing of the relevant data must comply with the relevant laws, regulations, and standards of the relevant countries and regions.

Claims

1. A method of training a drug effect prediction model for predicting a drug effect, comprising:

inputting a dataset into the drug effect prediction model, the dataset comprising a plurality of samples, each sample comprising at least one first entry, at least one second entry, and a real label of the sample;

obtaining a knowledge graph associated with the drug in each of the at least one first entry, the knowledge graph comprising a vector representation of at least one entity associated with the drug;

inputting the at least one first entry and the vector representation of the at least one entity associated with the drug into a first encoder to obtain a first entry enhancement vector representation;

inputting the at least one second entry into a second encoder to obtain a second entry vector representation;

calculating a mutual attention representation between the first entry enhancement vector representation and the second entry vector representation;

inputting the mutual attention representation into a multi-layer perceptron to obtain a drug effect prediction result;

and determining a target loss of the drug effect prediction model based on the drug effect prediction result and the real label, iteratively updating parameters of the drug effect prediction model until the target loss meets a preset condition, and determining the drug effect prediction model with the updated parameters as the trained drug effect prediction model.

2. The method of claim 1, wherein for each sample, each first entry includes at least a drug name and a method of administration of the drug, each second entry includes at least one piece of diagnostic disease information, and the real label includes at least a probability of death.

3. The method of claim 1, wherein inputting the at least one first entry and the vector representation of the at least one entity associated with the drug into a first encoder to obtain a first entry enhancement vector representation comprises:

inputting each first entry of the at least one first entry into a first sub-encoder to obtain a first entry vector representation;

splicing the vector representation of the at least one entity associated with the drug with the first entry vector representation to obtain a spliced first entry vector representation; and

inputting the at least one spliced first entry vector representation into a second sub-encoder, the second sub-encoder deriving the first entry enhancement vector representation based on the at least one spliced first entry vector representation,

wherein the first sub-encoder and the second sub-encoder are included in the first encoder, and the at least one entity associated with the drug includes at least one disease entity associated with the drug and other drug entities similar to the drug.

4. The method of claim 3, wherein the vector representation of the at least one entity associated with the drug is obtained at least by:

obtaining, from the knowledge graph, a vector representation of at least one disease associated with the drug in each first entry and a vector representation of other drugs similar to the drug; and

averaging the vector representation of the at least one disease and the vector representation of the other drugs to obtain the vector representation of the at least one entity associated with the drug.
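By way of illustration only, the averaging of claim 4 and the splicing of claim 3 can be sketched in Python as follows. The function names `mean_vector` and `splice`, the toy three-dimensional vectors, and the choice to pool the disease vectors and the similar-drug vectors in a single element-wise mean are assumptions made for this sketch; the claims do not fix any of them.

```python
from typing import List

Vector = List[float]


def mean_vector(vectors: List[Vector]) -> Vector:
    """Element-wise mean of equally sized vectors."""
    if not vectors:
        raise ValueError("at least one vector is required")
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("all vectors must have the same dimension")
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def splice(entry_vector: Vector, entity_vector: Vector) -> Vector:
    """Concatenate a first entry vector with the averaged entity vector."""
    return entry_vector + entity_vector


# Hypothetical knowledge-graph embeddings for one drug (illustrative values).
disease_vectors = [[1.0, 0.0, 2.0], [3.0, 2.0, 0.0]]
similar_drug_vectors = [[2.0, 4.0, 1.0]]

# Assumption: diseases and similar drugs are averaged together in one pool.
entity_vector = mean_vector(disease_vectors + similar_drug_vectors)

# Assumed output of the first sub-encoder for this first entry.
first_entry_vector = [0.5, 0.5, 0.5]
spliced = splice(first_entry_vector, entity_vector)

print(entity_vector)  # [2.0, 2.0, 1.0]
print(spliced)        # [0.5, 0.5, 0.5, 2.0, 2.0, 1.0]
```

In this sketch the spliced vector is what would be passed to the second sub-encoder; the encoder itself is omitted because the claims do not specify its architecture.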

5. The method of claim 1 or 2, wherein the target loss of the model comprises a first loss and a second loss,

the first loss is a cross entropy function between the drug effect prediction result and the real label;

the second loss is calculated based on the steps of:

comparing the drug effect prediction result with the real label, and dividing the samples into a first set and a second set based on whether the drug effect prediction result is consistent with the real label;

and performing contrastive learning on the first set and the second set, and calculating the second loss based on a contrastive learning loss of the contrastive learning.
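By way of illustration only, the two-part target loss of claim 5 can be sketched in Python as follows. Everything beyond the claim wording is an assumption of this sketch: the binary label, the 0.5 decision threshold, the InfoNCE-style form of the contrastive loss (samples in the same set are treated as positives, samples in the other set as negatives), the cosine similarity, the temperature, and the weight `second_loss_weight` used to combine the two losses. The helper names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def binary_cross_entropy(probs: Sequence[float], labels: Sequence[int]) -> float:
    """First loss: mean cross entropy between predictions and real labels."""
    eps = 1e-12
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)


def split_by_correctness(probs: Sequence[float], labels: Sequence[int],
                         threshold: float = 0.5) -> Tuple[List[int], List[int]]:
    """Indices whose prediction matches the label, and indices where it does not."""
    first, second = [], []
    for i, (p, y) in enumerate(zip(probs, labels)):
        predicted = 1 if p >= threshold else 0
        (first if predicted == y else second).append(i)
    return first, second


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def contrastive_loss(reps: Sequence[Sequence[float]], first: List[int],
                     second: List[int], temperature: float = 0.5) -> float:
    """Assumed InfoNCE-style loss: same-set pairs are positives, cross-set pairs negatives."""
    set_of = {i: 0 for i in first}
    set_of.update({i: 1 for i in second})
    losses = []
    for i in set_of:
        others = [j for j in set_of if j != i]
        positives = [j for j in others if set_of[j] == set_of[i]]
        if not positives:
            continue
        denom = sum(math.exp(cosine(reps[i], reps[j]) / temperature) for j in others)
        for j in positives:
            num = math.exp(cosine(reps[i], reps[j]) / temperature)
            losses.append(-math.log(num / denom))
    return sum(losses) / len(losses) if losses else 0.0


# Toy batch: predicted probabilities, real labels, and sample representations.
probs = [0.9, 0.2, 0.4, 0.7]
labels = [1, 0, 1, 0]
reps = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]

first_set, second_set = split_by_correctness(probs, labels)
first_loss = binary_cross_entropy(probs, labels)
second_loss = contrastive_loss(reps, first_set, second_set)

second_loss_weight = 0.1  # assumed weighting; the claim does not specify one
target_loss = first_loss + second_loss_weight * second_loss
print(first_set, second_set)
```

In this toy batch, samples 0 and 1 are predicted consistently with their labels and fall into the first set, while samples 2 and 3 fall into the second set.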

6. The method of claim 3 or 4, wherein the drug effect prediction result comprises at least one of a prediction of in-hospital mortality, a prediction of whether a length of hospitalization is greater than a predetermined number of days, and a predicted health score.

7. The method of claim 1, wherein calculating a mutual attention representation between the first entry enhancement vector representation and the second entry vector representation comprises:

denoting the first entry enhancement vector representation as c and the second entry vector representation as x, and calculating the mutual attention representation between the first entry enhancement vector representation and the second entry vector representation based on attention formulas (not reproduced in this text), the mutual attention representation comprising a representation of the drug that attends to the diagnostic disease information and a representation of the diagnostic disease information that attends to the drug,

wherein Attention is an attention function.
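By way of illustration only, a mutual attention computation of the kind recited in claim 7 can be sketched in Python as follows. The formula itself is not legible in this text, so the sketch assumes a standard scaled dot-product attention, softmax(Q K^T / sqrt(d)) V, without learned projection matrices, and assumes that each direction uses one representation as the query and the other as both key and value. The variable names and toy values are hypothetical.

```python
import math
from typing import List

Matrix = List[List[float]]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(query: Matrix, key: Matrix, value: Matrix) -> Matrix:
    """Assumed scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = len(query[0])
    output = []
    for q in query:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in key]
        weights = softmax(scores)
        output.append([sum(w * v[j] for w, v in zip(weights, value))
                       for j in range(len(value[0]))])
    return output


# c: first entry enhancement vectors (three drug entries, dimension 2).
c = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# x: second entry vectors (two diagnosis entries, dimension 2).
x = [[0.5, 0.5], [1.0, 0.0]]

# Drug representation that attends to the diagnostic disease information.
drug_given_diagnosis = attention(c, x, x)
# Diagnosis representation that attends to the drug information.
diagnosis_given_drug = attention(x, c, c)

print(len(drug_given_diagnosis), len(diagnosis_given_drug))  # 3 2
```

Each output row is a weighted average of the value rows, so the drug-side output has one row per drug entry and the diagnosis-side output has one row per diagnosis entry. How the two outputs are combined before the multi-layer perceptron is not specified in the claim and is left out of the sketch.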

8. A method of predicting a drug effect, comprising:

inputting a first entry comprising at least a drug name and a method of administration of the drug and a second entry comprising at least one piece of diagnostic disease information into a drug effect prediction model for predicting a drug effect;

outputting a drug effect prediction result;

wherein the drug effect prediction model is trained based on the steps of:

9. The method of claim 8, wherein inputting the at least one first entry and the vector representation of the at least one entity associated with the drug into a first encoder to obtain a first entry enhancement vector representation comprises:

10. The method of claim 9, wherein the vector representation of the at least one entity associated with the drug is obtained by at least:

11. A training device for a drug effect prediction model for predicting a drug effect, comprising:

an input module configured to input a dataset into the drug effect prediction model, the dataset comprising a plurality of samples, each sample comprising at least one first entry, at least one second entry, and a real label of the sample;

an acquisition module configured to acquire a knowledge graph associated with the drug in each of the at least one first entry, the knowledge graph comprising a vector representation of at least one entity associated with the drug;

a first vector representation module configured to input the at least one first entry and the vector representation of the at least one entity associated with the drug into a first encoder to obtain a first entry enhancement vector representation;

a second vector representation module configured to input the at least one second entry into a second encoder to obtain a second entry vector representation;

a mutual attention module configured to calculate a mutual attention representation between the first entry enhancement vector representation and the second entry vector representation;

an effect prediction module configured to input the mutual attention representation into a multi-layer perceptron to obtain a drug effect prediction result;

and an iteration module configured to determine a target loss of the drug effect prediction model based on the drug effect prediction result and the real label, iteratively update parameters of the drug effect prediction model until the target loss meets a preset condition, and determine the drug effect prediction model with the updated parameters as the trained drug effect prediction model.

12. A drug effect prediction device, comprising:

an input module configured to input a first entry including at least a drug name and a method of administration of the drug and a second entry including at least one piece of diagnostic disease information into a drug effect prediction model for predicting a drug effect;

an output module configured to output a drug effect prediction result;

wherein the drug effect prediction model is trained based on the steps of:

13. A computing device, comprising:

a memory configured to store computer-executable instructions; and

a processor configured to perform the method according to any one of claims 1-10 when the computer-executable instructions are executed by the processor.

14. A computer-readable storage medium storing computer-executable instructions which, when executed, implement the method according to any one of claims 1-10.