CN113838583A - Intelligent drug efficacy evaluation method based on machine learning and application thereof - Google Patents


Info

Publication number
CN113838583A
Authority
CN
China
Prior art keywords
medicine
ranking
symptom
evaluating
medicines
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111135248.5A
Other languages
Chinese (zh)
Other versions
CN113838583B (en)
Inventor
尚磊
杨喆
张玉海
梁英
张海悦
王玥
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Air Force Medical University of PLA
Original Assignee
Air Force Medical University of PLA
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Air Force Medical University of PLA
Priority to CN202111135248.5A
Publication of CN113838583A
Application granted
Publication of CN113838583B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H70/00 ICT specially adapted for the handling or processing of medical references
    • G16H70/40 ICT specially adapted for the handling or processing of medical references relating to drugs, e.g. their side effects or intended usage

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Medical Informatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Medicinal Chemistry (AREA)
  • Mathematical Physics (AREA)
  • Chemical & Material Sciences (AREA)
  • Computing Systems (AREA)
  • Pharmacology & Pharmacy (AREA)
  • Toxicology (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Public Health (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The invention discloses an intelligent drug efficacy evaluation method based on machine learning and application thereof, wherein the method comprises the steps of establishing a mapping relation between a drug and a corresponding target treatment disease or symptom; extracting the corresponding potential side effects of the medicines, and calculating similarity indexes among the medicines; labeling the data on the medicine line to mark whether the medicine is effective or not, structuring the text data of the medicine, and extracting multi-dimensional crowd information and medicine and treatment related feature vectors; dividing the structured data into a training set and a verification set, establishing an integrated prediction model and selecting a scheme with optimal prediction effect by utilizing various algorithms and different characteristic variable selection mechanisms; and finally, obtaining the effective rate ranking of the similar medicines according to the medicine similarity index and realizing various functions of medicine curative effect evaluation through the application.

Description

Intelligent drug efficacy evaluation method based on machine learning and application thereof
Technical Field
The invention relates to the fields of biomedicine and artificial intelligence, in particular to an intelligent medicine curative effect evaluation method based on machine learning and application thereof.
Background
In the biopharmaceutical field, efficacy, curative effect (effectiveness) and benefit (efficiency) are three indicators used to evaluate a drug at different stages and in different settings. Efficacy generally refers to the magnitude of the therapeutic effect that a drug can achieve under ideal conditions, such as during clinical trials, and represents the maximum expected effect of the drug. The curative effect is the magnitude of the therapeutic effect that the drug achieves under actual medical and health conditions, that is, the result obtained from real-world data. Benefit refers to whether the value of a drug is commensurate with the cost paid by an individual or by society; it considers not only clinical effectiveness but also cost-effectiveness, and is generally used in health economics evaluations.
When a drug passes phase III clinical trials and is approved for marketing, its efficacy is then tested in the real world. Under real-world conditions, factors such as the patient population, drug dosage and frequency of use are far more complex than in randomized clinical trials, so the evaluation of real-world drug efficacy is receiving increasing attention. Thanks to the development of big data technology, information can now be extracted from online drug reviews, case reports, drug use guidelines, precautions and similar sources through large-scale data mining.
Existing research on, and methods for, evaluating real-world drug efficacy target only a single data source, for example evaluating efficacy through survey reports, clinical follow-up visits or phase IV trials, and the population such studies can cover is still limited by factors such as research funding, study scale and selection bias. The invention integrates data from different information sources by using text mining and ensemble machine learning algorithms, extracts effective feature values, and establishes a comprehensive drug efficacy evaluation system together with a decision mechanism for its application, realizing functions such as drug recommendation, evaluation of efficacy and side effects, and comparison of similar drugs.
The invention not only enables long-term, large-scale monitoring and evaluation of drug efficacy after a drug is marketed, but can also serve as an important reference index when weighing a drug's effectiveness against its cost.
Disclosure of Invention
The invention aims to provide an intelligent drug evaluation method based on machine learning and an application thereof. The method combines large-scale internet data with hospital medical records and follow-up or survey report data to obtain real-time feedback on drug use across a wider population, and evaluates drug efficacy comprehensively from multiple information sources. It avoids the drawbacks of conventional post-marketing efficacy evaluation, such as the high cost of recruiting subjects and artificially imposed inclusion and exclusion criteria, and evaluates the efficacy and side effects of drugs under a variety of conditions more comprehensively and efficiently.
In a first aspect, the invention provides an intelligent drug evaluation method based on machine learning, which comprises the following steps:
1) Extract the mapping between each drug and the diseases or symptoms it treats from the drug's instructions for use and the drug guides of the drug regulatory authority. Suppose the drugs are $D_i$, $i = 1, \dots, I$, for $I$ drugs; the corresponding target treatment diseases or symptoms are $S_j$, $j = 1, \dots, J$, for $J$ target diseases or symptoms; and the corresponding potential side effects are $E_k$, $k = 1, \dots, K$, for $K$ potential side effects.
Calculate the drug similarity index between drugs. Specifically, suppose the set of target treatment diseases or symptoms of drug $D_i$ is $S(D_i)$ and that of drug $D_{i'}$ is $S(D_{i'})$; then the drug similarity index is
$$\mathrm{Sim}(D_i, D_{i'}) = \frac{|S(D_i) \cap S(D_{i'})|}{|S(D_i) \cup S(D_{i'})|}.$$
2) Group the online drug reviews, hospital medical records and follow-up records by the target treatment disease or symptom $S_j$ from step 1), and label each review and medical record as "effective" or "ineffective".
Specifically, labeling is automatic: sentiment analysis is performed on the semantics, and VADER (Valence Aware Dictionary and sEntiment Reasoner) scores each sentence with a value from -1 (negative) to 1 (positive), with 0 being a neutral opinion. Manual checking may additionally be performed after automatic labeling.
3) Structure the text data: a) extract multi-dimensional population information such as age, gender, ethnicity, marital and childbearing status, and region; b) extract feature vectors: extract characteristic words or phrases such as anti-inflammatory, fever, headache, cold and cough from the online drug reviews, medical records and follow-up records to obtain feature vectors.
4) Divide the structured data set obtained from the text data into a training set and a validation set in a fixed proportion.
Specifically, the training set and validation set may be divided in a ratio of 8:2, 7:3 or 6:4.
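The split in step 4) can be sketched as follows. This is a minimal illustration using only the standard library; the function name `split_dataset`, the fixed random seed and the shuffling step are assumptions, since the text specifies only the ratios.

```python
import random


def split_dataset(records, train_fraction=0.8, seed=42):
    """Shuffle a copy of `records` and split it into (training, validation)."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    shuffled = list(records)
    # A seeded generator keeps the split reproducible (assumption, not in the text).
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# The three ratios named in the text: 8:2, 7:3 and 6:4.
for fraction in (0.8, 0.7, 0.6):
    train, val = split_dataset(range(100), train_fraction=fraction)
    print(fraction, len(train), len(val))
```

In practice a stratified split, which keeps the share of "effective" labels equal in both sets, may be preferable when the classes are imbalanced.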
5) Select several algorithms as classifiers for the binary classification problem.
Specifically, four classifiers for the two-class problem may be selected: a) OneVsRest SVM, b) Logistic Regression, c) Random Forest, d) Bagging meta-estimator with Logistic Regression base.
6) Establish different feature-variable selection mechanisms, and select the scheme with the best prediction performance across the classifiers and feature variables.
Specifically, the feature variables may be obtained by combining term counts (Count), term frequency-inverse document frequency (tf-idf, i.e., Tfidf) and the VADER score, for example:
FS-1: CountVectorizer,
FS-2: CountVectorizer + VADER score,
FS-3: CountVectorizer top 10000 features + VADER score,
FS-4: TfidfVectorizer,
FS-5: TfidfVectorizer + VADER score,
FS-6: TfidfVectorizer top 10000 features + VADER score.
Further, the optimal prediction scheme is evaluated by F1-score,
F1 score = 2*(Recall * Precision) / (Recall + Precision);
wherein Recall = true positive/(true positive + false negative), Precision = true positive/(true positive + false positive).
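The F1 formula above can be checked with a short worked example. The sketch below computes it directly from confusion-matrix counts; the function name and the convention of returning 0 when a denominator is zero are assumptions added for illustration.

```python
def f1_from_counts(true_positive, false_positive, false_negative):
    """F1 = 2 * (Recall * Precision) / (Recall + Precision)."""
    predicted_positive = true_positive + false_positive
    actual_positive = true_positive + false_negative
    precision = true_positive / predicted_positive if predicted_positive else 0.0
    recall = true_positive / actual_positive if actual_positive else 0.0
    if precision + recall == 0.0:
        return 0.0  # convention chosen here to avoid dividing by zero
    return 2 * (recall * precision) / (recall + precision)


# 60 true positives, 20 false positives, 20 false negatives:
# Precision = 60/80 = 0.75, Recall = 60/80 = 0.75, so F1 = 0.75.
print(f1_from_counts(60, 20, 20))
```

Because F1 is the harmonic mean of Recall and Precision, a scheme scores well only when both are reasonably high.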
The invention also provides an application of machine learning-based intelligent drug evaluation, which offers multiple functions: function 1) evaluating the efficacy of a given drug: entering the drug name yields the drug's effectiveness score, its ranking among similar drugs and its side-effect ranking; function 2) searching for drugs for a given disease or symptom: entering the names of one or more diseases or symptoms yields the one or more corresponding drugs with each drug's effectiveness score, ranking and side-effect ranking; function 3) ranking the effectiveness of a drug, its similar drugs and their side effects for multi-dimensional populations of different ages, genders, ethnicities, marital and childbearing statuses, regions and so on.
In the embodiment of the invention, the mapping between each drug and its target treatment diseases or symptoms is determined from approved information such as drug instructions for use and drug guides, and the effectiveness of the drug is predicted from information such as online drug reviews, medical records and follow-up records. Building the prediction model proceeds as follows: first, VADER sentiment analysis is performed on each statement to label whether the drug it refers to is effective; the structured data set is then divided into a training set and a validation set, and several binary classifiers (models) are trained under different feature extraction modes to obtain the optimal scheme. At the application level, the predictions of whether a drug is effective are applied to the initially determined mapping, and the effective rate and side effects of each drug for each disease or symptom are calculated, along with the effective rates of one or more similar drugs that treat the same target disease or symptom. Thus, when a drug is entered at the user end, the system shows the drug's effective rate and side effects for its target treatment diseases or symptoms together with the effective rates of similar drugs; when a disease or symptom is entered, it shows the effective rates of the one or more corresponding drugs and their respective side effects. In addition, by filtering on population information, the user can see the age, gender, ethnicity, marital and childbearing status, region and other information of the people taking the drug, as well as the effective rate within each subgroup.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below, the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained according to the drawings without creative efforts.
Fig. 1 is a schematic flow chart of an intelligent drug efficacy evaluation method based on machine learning in an embodiment of the present invention.
Fig. 2 is a schematic diagram of a software operation structure of a machine learning-based intelligent drug efficacy evaluation application in an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be described below in a clear and complete manner with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the invention. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.
The following describes embodiments of the present invention in detail.
Example 1
Referring to Fig. 1, which is a schematic flowchart of the machine learning-based intelligent drug efficacy evaluation method according to an embodiment of the present invention, the method includes:
101. Establish a mapping between each drug and its corresponding target treatment diseases or symptoms.
As shown in Table 1, the diseases or symptoms treated by metronidazole include septicemia, endocarditis, meningitis, colitis, tetanus and oral ulcer. The target treatment diseases or symptoms corresponding to a drug are extracted from its instructions for use or drug guide; the drug is denoted $D_i$ and its target treatment diseases or symptoms are denoted $S_j$. In this example, $D_i$ = metronidazole and $S_j$ = {septicemia, endocarditis, meningitis, colitis, tetanus, oral ulcer}. After the mapping between the drugs and their target treatment diseases or symptoms is established, step 102 is performed.
102. Extract the potential side effects corresponding to each drug, and calculate the similarity index between drugs.
As shown in Table 1, the potential side effects of metronidazole include nausea, vomiting, loss of appetite, abdominal cramps, headache, dizziness, paresthesia and numbness of the extremities. The potential side effects of a drug are extracted from its instructions for use or drug guide and denoted $E_k$. In this example, $E_k$ = {nausea, vomiting, loss of appetite, abdominal cramps, headache, dizziness, paresthesia, numbness of the extremities}.
As shown in Table 2, $D_{i'}$ = ornidazole. The diseases or symptoms treated by ornidazole are $S(D_{i'})$ = {septicemia, amebiasis, meningitis, periodontitis, endometritis, oral ulcer}, and its potential side effects are $E(D_{i'})$ = {nausea, oral malodor, dizziness, drowsiness, rash, spasm, confusion, numbness of limbs}.
The similarity index of the two drugs is then calculated. Metronidazole and ornidazole share three target diseases or symptoms (septicemia, meningitis and oral ulcer) out of nine distinct ones, so
$$\mathrm{Sim}(D_i, D_{i'}) = \frac{|S(D_i) \cap S(D_{i'})|}{|S(D_i) \cup S(D_{i'})|} = \frac{3}{9} \approx 0.333.$$
Further, a data dictionary of diseases and symptoms is designed that records the broader and narrower concepts of each disease or symptom. For example, periodontitis and oral ulcer are both oral infections; if the broader concept "oral infection" is used, the similarity index of the two drugs becomes 3/8 = 0.375.
Furthermore, a synonym-merging data dictionary is designed, containing same-level terms that can be treated as equivalent, such as {numbness of the extremities} and {numbness of limbs}, or {headache} and {dizziness}.
[Images from the original publication (likely Tables 1 and 2) are not reproduced here.]
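The worked example in step 102 can be sketched in code. This assumes the similarity index is the Jaccard index of the two target-disease sets, an assumption that reproduces the 3/8 = 0.375 value quoted above; the names `jaccard`, `apply_dictionary` and `BROADER_CONCEPT` are illustrative, and the dictionary holds only the one broader concept the text mentions.

```python
def jaccard(first, second):
    """Size of the intersection divided by size of the union of two term sets."""
    first, second = set(first), set(second)
    union = first | second
    return len(first & second) / len(union) if union else 0.0


def apply_dictionary(terms, mapping):
    """Replace each term by its dictionary entry, if any, and deduplicate."""
    return {mapping.get(term, term) for term in terms}


metronidazole = {"septicemia", "endocarditis", "meningitis",
                 "colitis", "tetanus", "oral ulcer"}
ornidazole = {"septicemia", "amebiasis", "meningitis",
              "periodontitis", "endometritis", "oral ulcer"}

# Broader-concept dictionary: both terms roll up to "oral infection".
BROADER_CONCEPT = {"periodontitis": "oral infection",
                   "oral ulcer": "oral infection"}

raw = jaccard(metronidazole, ornidazole)  # 3 shared terms out of 9 distinct
merged = jaccard(apply_dictionary(metronidazole, BROADER_CONCEPT),
                 apply_dictionary(ornidazole, BROADER_CONCEPT))  # 3 out of 8
print(round(raw, 3), merged)
```

The same `apply_dictionary` helper could be used with a synonym-merging dictionary, since both dictionaries simply map several surface terms onto one canonical term before the sets are compared.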
103. Label the online reviews, medical records and follow-up records of each drug to mark whether the drug is effective.
Specifically, VADER (Valence Aware Dictionary and sEntiment Reasoner) is used to score each sentence with a value from -1 (negative) to 1 (positive), with 0 being a neutral opinion. Using the VADER module in Python, the polarity_scores method returns four scores for a sentence: (a) negative, (b) positive, (c) neutral and (d) compound. The compound score is a normalized aggregate of the word-level valence scores, ranging from -1 to 1, and measures the overall positive or negative sentiment of the sentence. VADER is designed for sentiment analysis of English sentences, so the data sources for drug efficacy evaluation should be in English wherever possible; Chinese text, for example, can be translated into English by an automatic translator and then manually checked.
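A minimal sketch of the automatic labeling in step 103 follows. The text does not state how the compound score is turned into an "effective" or "ineffective" label, so the cutoff of 0.05 and the choice to treat neutral sentences as "ineffective" are assumptions; the function names are also illustrative. The `vaderSentiment` package is imported only when no analyzer is supplied.

```python
def label_from_compound(compound, threshold=0.05):
    """Map a VADER compound score in [-1, 1] to a binary label.

    The threshold is an assumption; the source does not specify one.
    """
    if not -1.0 <= compound <= 1.0:
        raise ValueError("compound score must lie in [-1, 1]")
    return "effective" if compound >= threshold else "ineffective"


def label_review(text, analyzer=None):
    """Score one English sentence with VADER and return its label."""
    if analyzer is None:
        # Third-party package: pip install vaderSentiment
        from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
        analyzer = SentimentIntensityAnalyzer()
    scores = analyzer.polarity_scores(text)  # keys: neg, neu, pos, compound
    return label_from_compound(scores["compound"])
```

A call such as `label_review("The infection cleared up within three days.")` would return one of the two labels; labels near the threshold are natural candidates for the manual check mentioned in the text.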
104. Structure the text data, and extract multi-dimensional population information and feature vectors related to drugs and treatment.
The population information includes, but is not limited to, age, gender, ethnicity, marital and childbearing status, and region. For online drug reviews, this information can be obtained from the back-end database; for hospital medical records, it can be obtained from the hospital's medical record management system; and follow-up surveys should be designed in advance so that the follow-up records capture as much population information as possible.
Words that reflect important characteristics of the text in online drug reviews, hospital medical records and follow-up records are converted into vector form using term counts (CountVectorizer) and term frequency-inverse document frequency (tf-idf).
Further, the feature vectors may be built according to the following rules, for example:
FS-1: CountVectorizer,
FS-2: CountVectorizer + VADER score,
FS-3: CountVectorizer top 10000 features + VADER score,
FS-4: TfidfVectorizer,
FS-5: TfidfVectorizer + VADER score,
FS-6: TfidfVectorizer top 10000 features + VADER score.
105. Divide the structured data into a training set and a validation set.
The data set converted into vector form in the preceding steps is divided into a training set and a validation set in a ratio of 8:2, 7:3 or 6:4.
106. Select several algorithms as classifiers for the binary classification problem.
Further, four commonly used algorithms are chosen to train the classifiers: a) OneVsRest SVM, b) Logistic Regression, c) Random Forest, d) Bagging meta-estimator with Logistic Regression base.
107. Establish different feature-variable selection mechanisms, and select the scheme with the best prediction performance across the classifiers and feature variables.
Table 3 shows the F1-scores obtained by training on the data with each combination of the four classifiers from step 106 and the six feature-variable selection rules from step 104, where
F1 score = 2*(Recall * Precision) / (Recall + Precision);
wherein Recall = true positive/(true positive + false negative), Precision = true positive/(true positive + false positive).
[Table 3 is an image in the original publication and is not reproduced here.]
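The classifier-by-feature-set comparison in steps 104 to 107 might be sketched with scikit-learn as below. This is an illustration under stated assumptions, not the patent's implementation: "top 10000 features" is interpreted as `max_features=10000`, the hyperparameters are arbitrary, the VADER scores are supplied as precomputed numbers, and the toy sentences and labels are hand-written placeholders.

```python
import numpy as np
from scipy.sparse import csr_matrix, hstack
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import LinearSVC

# FS-1..FS-6: (vectorizer factory, whether to append the VADER score column).
FEATURE_SETS = {
    "FS-1": (lambda: CountVectorizer(), False),
    "FS-2": (lambda: CountVectorizer(), True),
    "FS-3": (lambda: CountVectorizer(max_features=10000), True),
    "FS-4": (lambda: TfidfVectorizer(), False),
    "FS-5": (lambda: TfidfVectorizer(), True),
    "FS-6": (lambda: TfidfVectorizer(max_features=10000), True),
}

# The four classifiers named in step 106 (hyperparameters are assumptions).
CLASSIFIERS = {
    "OneVsRest SVM": lambda: OneVsRestClassifier(LinearSVC()),
    "Logistic Regression": lambda: LogisticRegression(max_iter=1000),
    "Random Forest": lambda: RandomForestClassifier(n_estimators=50, random_state=0),
    "Bagging (LR base)": lambda: BaggingClassifier(
        LogisticRegression(max_iter=1000), n_estimators=10, random_state=0),
}


def add_score_column(matrix, vader_scores):
    """Append one column of VADER scores to a sparse feature matrix."""
    column = csr_matrix(np.asarray(vader_scores, dtype=float).reshape(-1, 1))
    return hstack([matrix, column]).tocsr()


def evaluate_grid(train, val):
    """train and val are (texts, vader_scores, labels); returns {(fs, clf): F1}."""
    (tr_texts, tr_vader, tr_y), (va_texts, va_vader, va_y) = train, val
    results = {}
    for fs_name, (make_vectorizer, use_vader) in FEATURE_SETS.items():
        vectorizer = make_vectorizer()
        x_train = vectorizer.fit_transform(tr_texts)  # fit on training data only
        x_val = vectorizer.transform(va_texts)
        if use_vader:
            x_train = add_score_column(x_train, tr_vader)
            x_val = add_score_column(x_val, va_vader)
        for clf_name, make_classifier in CLASSIFIERS.items():
            model = make_classifier().fit(x_train, tr_y)
            predictions = model.predict(x_val)
            results[(fs_name, clf_name)] = f1_score(va_y, predictions, zero_division=0)
    return results


# Toy placeholder data (1 = effective, 0 = ineffective), for illustration only.
texts = ["the infection cleared quickly", "no improvement and severe nausea"]
train = (texts * 10, [0.6, -0.7] * 10, [1, 0] * 10)
val = (texts * 3, [0.6, -0.7] * 3, [1, 0] * 3)
scores = evaluate_grid(train, val)
best = max(scores, key=scores.get)
print(best, round(scores[best], 3))
```

Fitting each vectorizer on the training texts only, and then transforming the validation texts, keeps validation vocabulary statistics out of training.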
108. Calculate the effective rate of each drug for its target treatment diseases or symptoms using the optimal scheme obtained by training.
According to Table 3, the optimal scheme for predicting effectiveness for the target treatment diseases or symptoms is the random forest (RandomForest) classifier with feature extraction mode FS-6: TfidfVectorizer top 10000 features + VADER score; the corresponding F1-score is 0.760. This scheme is used to predict labels for the unlabeled data, and the effective rates of a given drug for treating different diseases or symptoms are calculated separately.
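The effective-rate calculation in step 108 can be illustrated as follows. The text does not define the rate explicitly; this sketch assumes it is the share of records for a (drug, disease or symptom) pair that the model labels "effective". The function name and the placeholder drug and symptom names are hypothetical.

```python
from collections import defaultdict


def effective_rates(predictions):
    """predictions: iterable of (drug, symptom, label) tuples, where label is
    'effective' or 'ineffective'. Returns {(drug, symptom): effective rate}."""
    totals = defaultdict(int)
    effective = defaultdict(int)
    for drug, symptom, label in predictions:
        totals[(drug, symptom)] += 1
        if label == "effective":
            effective[(drug, symptom)] += 1
    return {key: effective[key] / totals[key] for key in totals}


# Placeholder predictions, for illustration only.
predicted = [
    ("drug_a", "symptom_x", "effective"),
    ("drug_a", "symptom_x", "effective"),
    ("drug_a", "symptom_x", "ineffective"),
    ("drug_a", "symptom_x", "effective"),
    ("drug_a", "symptom_y", "ineffective"),
    ("drug_a", "symptom_y", "effective"),
]
rates = effective_rates(predicted)
print(rates)
```

Keying the result by the (drug, symptom) pair matches the mapping built in step 101, so the same drug can have a different effective rate for each disease or symptom it targets.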
109. Obtain the effective-rate ranking and the potential side-effect ranking of similar drugs according to the drug similarity index.
For a given drug, such as metronidazole, the similarity index between it and every other drug is calculated by the method in step 102; the five most similar drugs can be taken, and the effective rates of the drug and of these similar drugs are each calculated by the model. Using the characteristic words for the drug's potential side effects extracted in step 102, the side effects mentioned for the drug across all data sources are counted and ranked; the top 10, or another number of top-ranked side effects as appropriate, can be reported.
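Step 109 combines two rankings, which might look like the sketch below. The function names, the simple substring matching of side-effect terms and all sample values are assumptions made for illustration; the similarity indices and effective rates are taken as already computed.

```python
from collections import Counter


def rank_similar_drugs(similarity_to_target, effective_rate, top_n=5):
    """Keep the top_n most similar drugs, then order them by effective rate."""
    nearest = sorted(similarity_to_target, key=similarity_to_target.get,
                     reverse=True)[:top_n]
    return sorted(nearest, key=lambda drug: effective_rate[drug], reverse=True)


def rank_side_effects(texts, side_effect_terms, top_n=10):
    """Count how many texts mention each side-effect term; return the top_n."""
    counts = Counter()
    for text in texts:
        lowered = text.lower()
        for term in side_effect_terms:
            if term in lowered:
                counts[term] += 1
    return counts.most_common(top_n)


# Placeholder values, for illustration only.
similarity = {"drug_b": 0.9, "drug_c": 0.5, "drug_d": 0.7, "drug_e": 0.1}
rates = {"drug_b": 0.6, "drug_c": 0.9, "drug_d": 0.8, "drug_e": 0.4}
print(rank_similar_drugs(similarity, rates, top_n=2))

reviews = ["Nausea and headache after two days", "Severe nausea", "Some dizziness"]
print(rank_side_effects(reviews, ["nausea", "headache", "dizziness", "rash"], top_n=2))
```

Substring matching is the simplest option; applying the synonym-merging dictionary from step 102 before counting would avoid splitting one side effect across several spellings.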
Example 2
Based on the intelligent drug efficacy evaluation method described in the above embodiment, a machine learning-based intelligent drug efficacy evaluation application is developed. The application's back end includes a database that collects and manages the different data sources described above; the middle tier implements the intelligent drug efficacy evaluation method, with support for model parameter tuning and real-time monitoring; and the front end provides the following functions:
1) Evaluate the efficacy of a given drug: enter the drug name to obtain the drug's effectiveness score, its ranking among similar drugs and its side-effect ranking.
The drug effectiveness score is the effective rate of the drug obtained from the drug efficacy evaluation model.
2) Search for drugs for a given disease or symptom: enter the names of one or more diseases or symptoms to obtain the one or more corresponding drugs, together with each drug's effectiveness score, ranking and side-effect ranking.
When a disease or symptom is entered into the system, the mapping between drugs and target treatment diseases or symptoms obtained in step 101 of Example 1 is used to find the corresponding drugs, and the effectiveness score, ranking and potential side-effect ranking of each drug are calculated by the model.
3) Rank the effectiveness of a drug, its similar drugs and their side effects for multi-dimensional populations of different ages, genders, ethnicities, marital and childbearing statuses, regions and so on.
The population information is used as a filter condition when calculating drug effectiveness and when looking up similar drugs and their side-effect rankings.
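Functions 2) and 3) can be sketched as a single query over labeled records. The `Record` fields, the function name and the two filters shown (age range and gender) are hypothetical; a real system would read from the back-end database and support the other population dimensions as well.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    drug: str
    symptom: str
    effective: bool
    age: int
    gender: str


def drugs_for_symptom(records, symptom, min_age=None, max_age=None, gender=None):
    """Rank drugs for a symptom by effective rate within a population filter."""
    stats = {}
    for record in records:
        if record.symptom != symptom:
            continue
        if min_age is not None and record.age < min_age:
            continue
        if max_age is not None and record.age > max_age:
            continue
        if gender is not None and record.gender != gender:
            continue
        hits, total = stats.get(record.drug, (0, 0))
        stats[record.drug] = (hits + int(record.effective), total + 1)
    rates = {drug: hits / total for drug, (hits, total) in stats.items()}
    return sorted(rates.items(), key=lambda item: item[1], reverse=True)


# Placeholder records, for illustration only.
records = [
    Record("drug_a", "symptom_x", True, 34, "F"),
    Record("drug_a", "symptom_x", False, 67, "M"),
    Record("drug_b", "symptom_x", True, 29, "F"),
    Record("drug_b", "symptom_x", True, 71, "M"),
    Record("drug_b", "symptom_y", False, 45, "F"),
]
print(drugs_for_symptom(records, "symptom_x"))
print(drugs_for_symptom(records, "symptom_x", min_age=60))
```

Applying the population filter before the rate is computed means each subgroup gets its own ranking, which is what function 3) describes.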

Claims (9)

1. An intelligent drug efficacy evaluation method based on machine learning is characterized by comprising the following steps:
1) extracting the mapping between each drug and the diseases or symptoms it treats from the drug's instructions for use and the drug guides of the drug regulatory authority: supposing the drugs are $D_i$, $i = 1, \dots, I$, for $I$ drugs; the corresponding target treatment diseases or symptoms are $S_j$, $j = 1, \dots, J$, for $J$ target diseases or symptoms; and the corresponding potential side effects are $E_k$, $k = 1, \dots, K$, for $K$ potential side effects; and calculating the drug similarity index between the drugs $D_i$;
2) grouping the online drug reviews, hospital medical records and follow-up records by the target treatment disease or symptom $S_j$ from step 1), and labeling each review and medical record as "effective" or "ineffective";
3) structuring the text data:
a) extracting multi-dimensional population information such as age, gender, ethnicity, marital and childbearing status, and region,
b) extracting feature vectors: extracting characteristic words or phrases from the online drug reviews, medical records and follow-up records to obtain feature vectors;
4) dividing the structured data set converted from the text data into a training set and a validation set in a fixed proportion;
5) selecting a plurality of algorithms as classifiers for predicting the two-classification problem;
6) establishing different characteristic variable selection mechanisms, and selecting a scheme with optimal prediction effect of various classifiers under different characteristic variables;
7) using the optimal scheme obtained by the training in step 6), calculating the effective rate of drug $D_i$ for the target treatment disease or symptom $S_j$; obtaining the ranking of the drug's effective rate among similar drugs according to the drug similarity index calculated in step 1); and extracting from the data set the characteristic words for the potential side effects $E_k$ of the drug from step 1) and ranking them.
2. The intelligent drug efficacy evaluation method according to claim 1, wherein the labeling in step 2) is performed automatically: sentiment analysis is performed on the semantics, and VADER (Valence Aware Dictionary and sEntiment Reasoner) scores each sentence with a value from -1 (negative) to 1 (positive), with 0 being a neutral opinion.
3. The method for evaluating the curative effect of an intelligent drug according to claim 2, wherein the labeling in step 2) is automated and then manually checked.
4. The intelligent drug efficacy evaluation method according to claim 1, wherein the training set and the validation set in step 4) are divided in a ratio of 8:2 or 7:3.
5. The method for evaluating the curative effect of an intelligent drug according to claim 1, wherein the classifiers in the step 5) are four types:
a)OneVsRest SVM,
b) Logistic Regression,
c) Random Forest,
d) Bagging meta-estimator with Logistic Regression base.
6. The intelligent drug efficacy evaluation method according to claim 1, wherein the feature variables in step 6) are obtained by combining term counts (Count), term frequency-inverse document frequency (tf-idf, Tfidf) and VADER scores, such as:
FS-1: CountVectorizer,
FS-2: CountVectorizer + VADER score,
FS-3: CountVectorizer top 10000 features + VADER score,
FS-4: TfidfVectorizer,
FS-5: TfidfVectorizer + VADER score,
FS-6: TfidfVectorizer top 10000 features + VADER score.
7. The intelligent drug efficacy evaluation method according to claim 1, wherein the optimal prediction scheme in step 6) is evaluated by F1-score, F1 score = 2*(Recall * Precision) / (Recall + Precision); wherein Recall = true positive/(true positive + false negative), Precision = true positive/(true positive + false positive).
8. The intelligent drug efficacy evaluation method according to claim 1, wherein the drug similarity index $\mathrm{Sim}(D_i, D_{i'})$ in step 1) is calculated as follows: supposing the set of target treatment diseases or symptoms of drug $D_i$ is $S(D_i)$ and that of drug $D_{i'}$ is $S(D_{i'})$, then
$$\mathrm{Sim}(D_i, D_{i'}) = \frac{|S(D_i) \cap S(D_{i'})|}{|S(D_i) \cup S(D_{i'})|}.$$
9. A machine learning based intelligent drug efficacy assessment application that can perform multiple functions according to the method of any of claims 1 to 8, comprising:
function 1) evaluating the curative effect of a given drug: the drug name is input, and the effectiveness score of the drug, its ranking among drugs of the same kind, and its side-effect ranking are obtained;
function 2) searching for drugs corresponding to a given disease or symptom: the names of one or more diseases or symptoms are input, and the effectiveness score, ranking and side-effect ranking of each of the one or more corresponding drugs are obtained;
function 3) ranking the effectiveness of a drug, similar drugs and their side effects for populations along multiple dimensions, such as age, sex, ethnicity, marital and childbearing status, and region.
CN202111135248.5A 2021-09-27 2021-09-27 Intelligent medicine curative effect evaluation method based on machine learning and application thereof Active CN113838583B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111135248.5A CN113838583B (en) 2021-09-27 2021-09-27 Intelligent medicine curative effect evaluation method based on machine learning and application thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111135248.5A CN113838583B (en) 2021-09-27 2021-09-27 Intelligent medicine curative effect evaluation method based on machine learning and application thereof

Publications (2)

Publication Number Publication Date
CN113838583A (en) 2021-12-24
CN113838583B (en) 2023-10-24

Family

ID=78970737

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111135248.5A Active CN113838583B (en) 2021-09-27 2021-09-27 Intelligent medicine curative effect evaluation method based on machine learning and application thereof

Country Status (1)

Country Link
CN (1) CN113838583B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115497630A (en) * 2022-08-24 2022-12-20 中国医学科学院北京协和医院 Method and system for processing acute severe ulcerative colitis data
CN116758062A (en) * 2023-08-11 2023-09-15 之江实验室 Drug effectiveness evaluation method and device
CN118072980A (en) * 2024-04-18 2024-05-24 首都医科大学附属北京儿童医院 Method and related equipment for evaluating mucociliary clearance function of mucous membrane in nasal cavity

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104200069A (en) * 2014-08-13 2014-12-10 周晋 Drug use recommendation system and method based on symptom analysis and machine learning
CN104951665A (en) * 2015-07-22 2015-09-30 浙江大学 Method and system of medicine recommendation
CN107092797A (en) * 2017-04-26 2017-08-25 广东亿荣电子商务有限公司 A kind of medicine proposed algorithm based on deep learning
CN107403069A (en) * 2017-07-31 2017-11-28 京东方科技集团股份有限公司 A kind of medicine disease association relationship analysis system and method
US20190035496A1 (en) * 2016-02-29 2019-01-31 Mor Research Applications Ltd System and method for selecting optimal medications for a specific patient
CN111599403A (en) * 2020-05-22 2020-08-28 电子科技大学 Parallel drug-target correlation prediction method based on sequencing learning
CN112116978A (en) * 2020-09-17 2020-12-22 陕西师范大学 Method, system and device for recommending rheumatism immunity medicine
US20210134418A1 (en) * 2019-11-04 2021-05-06 Georgetown University Method and System for Assessing Drug Efficacy Using Multiple Graph Kernel Fusion
CN113160879A (en) * 2021-04-25 2021-07-23 上海基绪康生物科技有限公司 Method for predicting drug relocation through side effect based on network learning
CN113241193A (en) * 2021-06-01 2021-08-10 平安科技(深圳)有限公司 Drug recommendation model training method, recommendation method, device, equipment and medium
CN113316720A (en) * 2019-01-15 2021-08-27 国际商业机器公司 Determining a drug effectiveness ranking for a patient using machine learning

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115497630A (en) * 2022-08-24 2022-12-20 中国医学科学院北京协和医院 Method and system for processing acute severe ulcerative colitis data
CN115497630B (en) * 2022-08-24 2023-11-03 中国医学科学院北京协和医院 Method and system for processing acute severe ulcerative colitis data
CN116758062A (en) * 2023-08-11 2023-09-15 之江实验室 Drug effectiveness evaluation method and device
CN118072980A (en) * 2024-04-18 2024-05-24 首都医科大学附属北京儿童医院 Method and related equipment for evaluating mucociliary clearance function of mucous membrane in nasal cavity

Also Published As

Publication number Publication date
CN113838583B (en) 2023-10-24

Similar Documents

Publication Publication Date Title
Basiri et al. A novel method for sentiment classification of drug reviews using fusion of deep and machine learning techniques
Castillo-Sánchez et al. Suicide risk assessment using machine learning and social networks: a scoping review
Perez et al. Semi-supervised medical entity recognition: A study on Spanish and Swedish clinical corpora
CN113838583A (en) Intelligent drug efficacy evaluation method based on machine learning and application thereof
Ramachandran et al. Named entity recognition on bio-medical literature documents using hybrid based approach
Liu et al. Extracting features with medical sentiment lexicon and position encoding for drug reviews
Wu et al. KAICD: A knowledge attention-based deep learning framework for automatic ICD coding
Shen et al. Enhancing ontology-driven diagnostic reasoning with a symptom-dependency-aware Naïve Bayes classifier
Afsana et al. Automatically assessing quality of online health articles
Falissard et al. Neural translation and automated recognition of ICD-10 medical entities from natural language: Model development and performance assessment
Taghizadeh et al. SINA-BERT: a pre-trained language model for analysis of medical texts in Persian
Rakhsha et al. Detecting adverse drug reactions from social media based on multichannel convolutional neural networks modified by support vector machine
Al-Jefri et al. Using machine learning for automatic identification of evidence-based health information on the web
Chaturvedi et al. Identifying mentions of pain in mental health records text: a natural language processing approach
Roosan et al. Artificial intelligent context-aware machine-learning tool to detect adverse drug events from social media platforms
Al Amin et al. Data driven classification of opioid patients using machine learning–an investigation
Cousyn et al. Towards using scientific publications to automatically extract information on rare diseases
Liu et al. Sentiment classification with medical word embeddings and sequence representation for drug reviews
Al-Smadi DeBERTa-BiLSTM: A multi-label classification model of Arabic medical questions using pre-trained models and deep learning
Vithanage et al. Contextual Word Embedding for Biomedical Knowledge Extraction: A Rapid Review and Case Study
Liu et al. Clinical quantitative information recognition and entity-quantity association from Chinese electronic medical records
Shi et al. Enhancing efficiency and capacity of telehealth services with intelligent triage: a bidirectional LSTM neural network model employing character embedding
He et al. A method of electronic medical record similarity computation
Raza Improving Clinical Decision Making with a Two-Stage Recommender System: A Case Study on MIMIC-III Dataset
Gatto et al. HealthE: Recognizing Health Advice & Entities in Online Health Communities

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant