CN108831563B - Decision method for distinguishing classification detection of adverse drug reaction signals - Google Patents

Info

Publication number
CN108831563B
Authority
CN
China
Prior art keywords
sample
adr
data
total
decision
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810275119.8A
Other languages
Chinese (zh)
Other versions
CN108831563A (en
Inventor
魏建香
张剑吟
刘美含
郭林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Posts and Telecommunications
Original Assignee
Nanjing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Posts and Telecommunications filed Critical Nanjing University of Posts and Telecommunications
Priority to CN201810275119.8A priority Critical patent/CN108831563B/en
Publication of CN108831563A publication Critical patent/CN108831563A/en
Application granted granted Critical
Publication of CN108831563B publication Critical patent/CN108831563B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H70/00 ICT specially adapted for the handling or processing of medical references
    • G16H70/40 ICT specially adapted for the handling or processing of medical references relating to drugs, e.g. their side effects or intended usage
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H20/00 ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance
    • G16H20/10 ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance relating to drugs or medications, e.g. for ensuring correct administration to patients

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Public Health (AREA)
  • Primary Health Care (AREA)
  • Medical Informatics (AREA)
  • Data Mining & Analysis (AREA)
  • Medicinal Chemistry (AREA)
  • Chemical & Material Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Pharmacology & Pharmacy (AREA)
  • Toxicology (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Investigating Or Analysing Biological Materials (AREA)

Abstract

The invention discloses a decision method for determining whether adverse drug reaction (ADR) signals should be detected by category. Based on ADR report data from China, it studies whether classified detection should be used during ADR signal detection, that is, whether signal detection should be performed separately on traditional Chinese medicine data. Decision indexes based on a standard library are designed and a decision tree for the classification decision is constructed; three signal detection methods, PRR, MHRA and IC, are used to compare the ADR signals extracted from a total sample and a sub-sample, and a recommendation is finally given on whether the total data should be detected by category. The invention provides a reference method for data classification decisions in adverse drug reaction signal detection in China.

Description

Decision method for distinguishing classification detection of adverse drug reaction signals
Technical Field
The invention belongs to the field of signal detection, and particularly relates to a decision method for judging the classification detection of adverse drug reaction signals.
Background
At present, ADR monitoring data are analyzed internationally mainly with signal detection methods such as PRR, ROR, MHRA and IC. ADR signal detection technology in China is still incomplete, and the main problems are as follows:
(1) No signal detection method suited to the quality of ADR data in China has been established; the internationally mainstream methods PRR, ROR, MHRA and IC are used instead. Several Chinese scholars have applied these methods to ADR data in China, but the detection results are poorly consistent, and the coexistence of multiple detection methods makes signal detection difficult.
(2) The ADR database in China contains all kinds of medicines, including traditional Chinese medicines, western medicines and biological products, and signal detection is not performed by medicine category. Because the data volume is large, the national adverse drug reaction monitoring center needs a long time for each round of signal detection over all the data, and the large number of screened signals must be analyzed manually by experts, so signal discovery is inefficient. No study has addressed whether ADR data can be classified for signal detection.
(3) Traditional Chinese medicines have complex components and differ essentially from western medicines. Whether adverse reaction reports of traditional Chinese medicines should undergo separate signal detection has long been controversial in China.
Disclosure of Invention
The technical problem to be solved is as follows: to address the above deficiencies of the prior art, the invention starts from an analysis of the results of applying the four internationally mainstream ADR signal detection methods to ADR data in China, selects three of them for the decision study, and constructs a decision tree for the classification decision. Part of the data in the total sample is selected as a sub-sample, signal detection is performed on the total sample and on the sub-sample with the three ADR signal detection methods, and a decision is finally given on whether the adverse reaction data of the total sample should be detected by category.
The technical scheme is as follows: a decision-making method for discriminating between categorical detections of adverse drug reaction signals, the decision-making method comprising the steps of:
1) Data acquisition and processing: data acquisition covers the raw ADR data and the standard library; data processing deletes from the raw ADR data the drugs not included in the standard library together with their adverse reaction data, then deletes data with an ADR frequency below 3 to obtain the total sample, and separates the total sample into a sub-sample and a remaining sample;
2) Signal detection: signal detection is performed on the total sample and on the sub-sample with an ADR signal detection method;
3) The validity of the total sample and the sub-sample before and after separation is judged with decision indexes based on the standard library; the recall R, the precision P and the difference rate D are used as the decision indexes, and the decision process is as follows:
301) Design a four-grid table based on the standard library: the detection results of the total sample and the sub-sample are labeled against the drug-adverse reaction combinations in the standard library; a combination that appears in the standard library is labeled 1, otherwise 0;
The elements of the four-grid table are a(a_1, a_0), b(b_1, b_0), c(c_1, c_0) and d(d_1, d_0), where a is the number of ADR combinations detected as positive signals in both the total sample and the sub-sample; b is the number detected as positive in the total sample and negative in the sub-sample; c is the number detected as negative in the total sample and positive in the sub-sample; d is the number detected as negative in both. a_1, b_1, c_1 and d_1 are the numbers of ADR combinations in a, b, c and d that are present in the standard library; a_0, b_0, c_0 and d_0 are the numbers that are not present in the standard library; thus a = a_1 + a_0, b = b_1 + b_0, c = c_1 + c_0 and d = d_1 + d_0;
302) Recall comparison: let the recall of the total sample and of the sub-sample be R_1 and R_2 respectively:
$$R_1 = \frac{a_1 + b_1}{a_1 + b_1 + c_1 + d_1}$$
$$R_2 = \frac{a_1 + c_1}{a_1 + b_1 + c_1 + d_1}$$
If R_1 > R_2 and the difference between R_1 and R_2 is greater than the first preset threshold, classified detection of the total sample is not needed; if R_2 > R_1 and the difference between R_1 and R_2 is greater than the first preset threshold, classified detection of the total sample is needed; if the difference between R_1 and R_2 is smaller than the first preset threshold, go to step 303);
303) Precision comparison: let the precision of the total sample and of the sub-sample be P_1 and P_2 respectively:
$$P_1 = \frac{a_1 + b_1}{a + b}$$
$$P_2 = \frac{a_1 + c_1}{a + c}$$
If P_1 > P_2 and the difference between P_1 and P_2 is greater than the first preset threshold, classified detection of the total sample is not needed; if P_2 > P_1 and the difference between P_1 and P_2 is greater than the first preset threshold, classified detection of the total sample is needed; if the difference between P_1 and P_2 is smaller than the first preset threshold, go to step 304);
304) Difference rate comparison: let the difference rate be D,
(The formula for D appears only as an image in the source and is not reproduced here.)
If D is greater than the second preset threshold, classified detection of the total sample is not needed; otherwise, classified detection of the total sample is needed.
Preferably, the raw ADR data are obtained from the national adverse drug reaction monitoring center; the standard library is a library of known ADRs built by collecting, over the network, the package inserts of the relevant medicines, each issue of the adverse drug reaction information bulletin issued by the national food and drug administration, drug alert newsletters and various regulatory documents.
Preferably, the decision method further includes verifying the classification result: when the decision is that the total sample needs classified detection, the remaining sample is tested against the total sample; if the values of the decision indexes R and P are smaller than the first preset threshold and the value of D is smaller than the second preset threshold, the decision is accepted; otherwise it is not accepted.
Preferably, the total sample is all the data, covering the three categories of traditional Chinese medicine, western medicine and biological products.
Preferably, the sub-sample is the data of the traditional Chinese medicine category; the remaining sample is the data of the western medicine and biological product categories.
Preferably, the ADR signal detection method in step 2) is a PRR, MHRA, ROR or IC method based on the disproportionality measurement principle.
Preferably, the first preset threshold is 2% and the second preset threshold is 10%.
Advantageous effects: based on Chinese ADR report data, the invention studies whether the total sample data should be detected by category during ADR signal detection, designs decision indexes based on a standard library, and constructs a decision tree for the classification decision.
The differences between the ADR signals extracted from the total sample and from the sub-sample by the three signal detection methods PRR, MHRA and IC are compared and the most suitable ADR signal detection method is determined, improving the timeliness and effectiveness of signal detection.
The invention provides a reference method for data classification decisions in adverse drug reaction signal detection in China.
Drawings
FIG. 1 is a decision flow diagram;
FIG. 2 is a decision tree for determining whether to classify;
In the figures, R_1 and R_2 denote the recall of the total sample and of the sub-sample respectively; P_1 and P_2 denote the precision of the total sample and of the sub-sample respectively; b_1 denotes the number of ADR combinations detected as positive in the total sample and negative in the sub-sample; c_1 denotes the number of combinations detected as positive in the sub-sample and negative in the total sample; b_1 and c_1 count known signals.
Detailed Description
The following examples are presented to enable one of ordinary skill in the art to more fully understand the present invention and are not intended to limit the invention in any way.
Example 1
1 ADR monitoring data acquisition and processing
1.1 data processing
(1) Raw data summary: 1823144 ADR report records covering two years were obtained from the national adverse drug reaction monitoring center. In 608710 of them (33.4%) one drug corresponds to several adverse reactions; these records were split into one-to-one relationships. Records whose drug name or adverse reaction name is "unknown" were deleted. The processed data comprise 2221942 records in three drug categories: traditional Chinese medicine with 317417 records (14.29%), western medicine with 1874904 records (84.38%) and biological products with 29621 records (1.33%). Aggregating all the data by generic drug name and adverse reaction name gives 139281 drug-adverse reaction combinations with their frequencies; this data set is denoted Data1. It contains 6174 drugs and 2458 adverse reactions.
(2) Establishing the standard library: to judge the validity of the classified detection results, a standard library of known adverse reactions is needed as a reference. A library of known ADRs, called the standard library, is built by collecting over the network the package inserts of the relevant medicines, each issue of the adverse drug reaction information bulletin issued by the national food and drug administration, drug alert newsletters, various regulatory documents and the like. The standard library contains 53774 drug-adverse reaction combinations; this data set is denoted Data2. It contains 2401 drugs and 2460 adverse reactions.
(3) To stay consistent with the drugs in the standard library, drugs not contained in the standard library Data2 are deleted from Data1, giving data set Data3. Because combinations with an ADR frequency of 1 or 2 are likely to be chance occurrences, and signal screening generally requires a frequency of at least 3, data with an ADR frequency below 3 are deleted. The resulting data set has 39782 records covering 1692 drugs and 877 adverse reactions, and is taken as the total sample.
(4) The traditional Chinese medicine data are separated from the total sample to give data set Data4, which has 4697 records covering 326 drugs and 283 adverse reactions. This data set is taken as the sub-sample; it is a proper subset of the total sample.
(5) The drug-adverse reaction combinations in both samples are labeled against the standard library: combinations that appear in the standard library are labeled "1", the others "0". The fields of both data sets include the drug category, the generic name, the adverse reaction name, the frequency, whether the combination is known, and so on.
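Steps (3) to (5) can be sketched in Python as follows. This is a minimal illustration, not the inventors' implementation: the function names, the dictionary layout and the toy data are all hypothetical, and the sketch assumes that each report has already been split into one (drug, adverse reaction) pair as in step (1).

```python
from collections import Counter

def build_total_sample(reports, standard_library, min_freq=3):
    """Aggregate (drug, reaction) reports into a labeled total sample."""
    # Drugs that occur in the standard library; reports of other drugs are dropped.
    library_drugs = {drug for drug, _ in standard_library}
    counts = Counter(pair for pair in reports if pair[0] in library_drugs)
    # Keep combinations reported at least min_freq times and label them
    # 1 if the combination is in the standard library, else 0.
    return {
        pair: {"freq": freq, "known": int(pair in standard_library)}
        for pair, freq in counts.items()
        if freq >= min_freq
    }

def split_by_category(sample, drug_category, category):
    """Split a sample into (sub_sample, remaining_sample) by drug category."""
    sub = {p: v for p, v in sample.items() if drug_category.get(p[0]) == category}
    rest = {p: v for p, v in sample.items() if p not in sub}
    return sub, rest

# Toy data, invented for illustration only.
reports = ([("herb_A", "rash")] * 3 + [("herb_A", "nausea")] * 2
           + [("drug_B", "rash")] * 4 + [("drug_X", "rash")] * 5)
standard_library = {("herb_A", "rash"), ("drug_B", "fever")}
drug_category = {"herb_A": "TCM", "drug_B": "western"}

total_sample = build_total_sample(reports, standard_library)
sub_sample, remaining_sample = split_by_category(total_sample, drug_category, "TCM")
```

In the toy data, drug_X is dropped because it is absent from the standard library, and the herb_A "nausea" combination is dropped because its frequency is below 3.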
1.2 statistical analysis of two sample data
The total sample covers 1972008 ADR reports, involving 1692 drugs, 39782 drug-adverse reaction combinations and 877 adverse reactions. On average there are 1165.49 reports per drug, 49.57 reports per ADR combination and 2248.58 reports per adverse reaction.
The sub-sample covers 199115 ADR reports, 10.1% of the total sample, involving 326 drugs, 4697 drug-adverse reaction combinations and 283 adverse reactions. On average there are 610.78 reports per drug (554.71 fewer than in the total sample), 42.39 reports per ADR combination (7.18 fewer) and 703.59 reports per adverse reaction (about 1545 fewer). The data show that the average level of ADR occurrence in the sub-sample (traditional Chinese medicine) is markedly lower than in the total sample.
Compared against the standard library, the total sample contains 13555 known drug-adverse reaction combinations (34.07%); the sub-sample contains 830 (17.67%), 16.4 percentage points lower, i.e. the share of known combinations in the traditional Chinese medicine data is far lower than in the total sample.
In terms of data quality, the total sample includes 88083 (4.47%) and the sub-sample includes 10007 (5.03%), so the sub-sample has relatively higher data quality than the total sample.
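The averages and shares quoted in this subsection follow directly from the report, drug, combination and reaction counts. The short Python check below recomputes them; the helper name `per_unit` and the dictionary keys are hypothetical, and only counts stated in the text are used.

```python
def per_unit(reports, units):
    """Average number of reports per unit (drug, combination or reaction)."""
    return round(reports / units, 2)

# Counts taken from the text above.
total = {"reports": 1972008, "drugs": 1692, "combinations": 39782,
         "reactions": 877, "known": 13555}
sub = {"reports": 199115, "drugs": 326, "combinations": 4697,
       "reactions": 283, "known": 830}

for name, s in (("total sample", total), ("sub-sample", sub)):
    print(name,
          per_unit(s["reports"], s["drugs"]),           # reports per drug
          per_unit(s["reports"], s["combinations"]),    # reports per combination
          per_unit(s["reports"], s["reactions"]),       # reports per adverse reaction
          round(100 * s["known"] / s["combinations"], 2))  # % known combinations
```

The printed values match the figures in the text: 1165.49, 49.57, 2248.58 and 34.07 for the total sample, and 610.78, 42.39, 703.59 and 17.67 for the sub-sample.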
TABLE 1 The ten most frequent adverse reactions in the two samples
Adverse reaction (total sample) | Reports (share) | Adverse reaction (sub-sample) | Reports (share)
Rash | 281399 (14.27%) | Rash | 32161 (16.15%)
Nausea | 226368 (11.48%) | Itching (pruritus) | 22736 (11.42%)
Itching (pruritus) | 173111 (8.78%) | Nausea | 14339 (7.20%)
Vomiting | 141098 (7.16%) | Dizziness | 9577 (4.81%)
Dizziness | 83278 (4.22%) | Chills | 9206 (4.62%)
Headache | 56507 (2.87%) | Palpitations | 8728 (4.38%)
Abdominal pain | 52291 (2.65%) | Vomiting | 8661 (4.35%)
Diarrhea | 49839 (2.53%) | Hypersensitivity-like reaction | 7182 (3.61%)
Hypersensitivity-like reaction | 49261 (2.50%) | Chest distress | 6251 (3.14%)
Chills | 39066 (1.98%) | Fever | 5847 (2.94%)
Total | 1152218 (58.44%) | Total | 124688 (62.62%)
Table 1 shows the ten most frequent adverse reactions in each sample; together they account for about 60% of the reports in each sample. Seven of the ten adverse reactions are the same in both samples; the total sample additionally has "headache", "abdominal pain" and "diarrhea", while the sub-sample has "palpitations", "chest distress" and "fever". The three top-ranked adverse reactions are "rash", "nausea" and "itching" in both samples, but the share of "nausea" in the total sample is 4.28 percentage points higher than in the sub-sample, while that of "itching" is 2.64 percentage points lower. The share of each adverse reaction differs between the samples; the largest gap is for "nausea" (11.48% in the total sample versus 7.20% in the sub-sample).
The above comparison shows that the two samples differ in several respects, but whether they differ in signal detection cannot be determined from it, and further comparative study is needed.
3 Application analysis and selection of detection methods
3.1 Signal detection methods
At present, ADR signal detection in China mainly uses four methods based on the disproportionality measurement principle: PRR, MHRA, ROR and IC. Their formulas are all based on the following classical four-grid table.
TABLE 2 Classical four-grid table
 | Target adverse reaction | Other adverse reactions
Target drug | a | b
Other drugs | c | d
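(1) PRR method
$$\mathrm{PRR} = \frac{a/(a+b)}{c/(c+d)} \qquad \text{(Equation 1)}$$
$$SE(\ln \mathrm{PRR}) = \sqrt{\frac{1}{a} - \frac{1}{a+b} + \frac{1}{c} - \frac{1}{c+d}} \qquad \text{(Equation 2)}$$
$$95\%\,CI = e^{\ln(\mathrm{PRR}) \pm 1.96\,SE(\ln \mathrm{PRR})} \qquad \text{(Equation 3)}$$
The threshold for signal generation is a lower 95% CI limit greater than 1, i.e.:
$$e^{\ln(\mathrm{PRR}) - 1.96\,SE(\ln \mathrm{PRR})} > 1 \qquad \text{(Equation 4)}$$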
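As an illustration of the PRR criterion, the following Python sketch computes the PRR, its 95% confidence interval and the signal flag from the four cell counts. It assumes the usual proportional reporting ratio definitions (a: target drug and target reaction, b: target drug and other reactions, c: other drugs and target reaction, d: other drugs and other reactions); the function name `prr_signal` is hypothetical, and the cell counts in the usage lines are invented.

```python
import math

def prr_signal(a, b, c, d, z=1.96):
    """Return (PRR, (lower, upper), is_signal) for a 2x2 table.

    Assumes a > 0 and c > 0 so that the ratio and its log are defined.
    """
    prr = (a / (a + b)) / (c / (c + d))
    # Standard error of ln(PRR).
    se = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    lower = math.exp(math.log(prr) - z * se)
    upper = math.exp(math.log(prr) + z * se)
    # Signal criterion: lower limit of the 95% CI greater than 1.
    return prr, (lower, upper), lower > 1

# Invented counts for illustration only.
prr, (low, high), flagged = prr_signal(20, 80, 100, 9800)
no_prr, _, no_flag = prr_signal(5, 95, 500, 9400)
```

With these invented counts the first combination is flagged (PRR is 19.8 and the whole interval lies above 1), while the second is not (PRR is 0.99).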
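(2) ROR method
The reporting odds ratio (ROR) method was first proposed by the Netherlands Pharmacovigilance Centre Lareb, and is calculated as:
$$\mathrm{ROR} = \frac{a/c}{b/d} = \frac{ad}{bc} \qquad \text{(Equation 5)}$$
$$SE(\ln \mathrm{ROR}) = \sqrt{\frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \frac{1}{d}} \qquad \text{(Equation 6)}$$
$$95\%\,CI = e^{\ln(\mathrm{ROR}) \pm 1.96\,SE(\ln \mathrm{ROR})} \qquad \text{(Equation 7)}$$
The threshold for signal generation is a lower 95% CI limit greater than 1, i.e.:
$$e^{\ln(\mathrm{ROR}) - 1.96\,SE(\ln \mathrm{ROR})} > 1 \qquad \text{(Equation 8)}$$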
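The ROR criterion can be sketched the same way. The Python function below assumes the usual reporting odds ratio definitions, ROR = ad/(bc) with standard error sqrt(1/a + 1/b + 1/c + 1/d); the name `ror_signal` is hypothetical and the counts are invented.

```python
import math

def ror_signal(a, b, c, d, z=1.96):
    """Return (ROR, (lower, upper), is_signal) for a 2x2 table.

    Assumes all four cells are positive so that the odds ratio is defined.
    """
    ror = (a * d) / (b * c)
    # Standard error of ln(ROR).
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(ror) - z * se)
    upper = math.exp(math.log(ror) + z * se)
    # Signal criterion: lower limit of the 95% CI greater than 1.
    return ror, (lower, upper), lower > 1

# Invented counts for illustration only.
ror, (ror_low, ror_high), ror_flagged = ror_signal(20, 80, 100, 9800)
```

For these counts the ROR is 24.5 and the combination is flagged, since the lower limit of the interval stays above 1.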
(3) MHRA method
The MHRA method is the composite criterion adopted by the UK Medicines and Healthcare products Regulatory Agency (MHRA): it combines the PRR value, the absolute number of reports and the Pearson χ² value to evaluate the association strength of a signal. The signal criteria are: PRR ≥ 2, number of reports A ≥ 3 and χ² ≥ 4.
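A minimal Python sketch of the composite MHRA criterion follows. It assumes that the report count A is the cell a of the four-grid table and that the Pearson chi-square statistic is computed for the 2x2 table without continuity correction; the text does not state either detail, so both are assumptions, and the function name `mhra_signal` is hypothetical.

```python
def mhra_signal(a, b, c, d):
    """Return True if the combination meets PRR >= 2, a >= 3 and chi2 >= 4.

    Assumes c > 0 and non-empty row and column totals.
    """
    n = a + b + c + d
    prr = (a / (a + b)) / (c / (c + d))
    # Pearson chi-square for a 2x2 table, no continuity correction (assumption).
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return prr >= 2 and a >= 3 and chi2 >= 4

# Invented counts for illustration only.
strong = mhra_signal(20, 80, 100, 9800)   # all three criteria met
too_few = mhra_signal(2, 98, 10, 9890)    # fails the a >= 3 criterion
```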
(4) IC method
In 2002, Bate et al. of the Uppsala Monitoring Centre (UMC) established a new adverse drug reaction signal detection method called the Bayesian Confidence Propagation Neural Network (BCPNN). Because it uses the information component (IC) from information theory, the method is now called the IC method. The formulas are as follows:
$$N = a + b + c + d \qquad \text{(Equation 9)}$$
$$r = \frac{(N+2)^2}{(a+b+1)(a+c+1)} \qquad \text{(Equation 10)}$$
$$E = \log_2 \frac{(a+1)\,r}{N+r} \qquad \text{(Equation 11)}$$
$$v_1 = \frac{N-a+r-1}{(a+1)(1+N+r)} \qquad \text{(Equation 12)}$$
$$v_2 = \frac{N-a-b+1}{(a+b+1)(3+N)} \qquad \text{(Equation 13)}$$
$$v_3 = \frac{N-a-c+1}{(a+c+1)(3+N)} \qquad \text{(Equation 14)}$$
$$v = (\log_{10} 2)^{-2}\,(v_1+v_2+v_3) \qquad \text{(Equation 15)}$$
(Equation 16 appears only as an image in the source and is not reproduced here.)
The signal criterion is: IC > 0.
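The quantities in Equations 9 to 14 can be computed as in the Python sketch below. It deliberately stops before Equations 15 and 16, because Equation 16 is not available in the text; how E and the variance terms are combined into the final IC criterion is therefore left open. The function name `ic_terms` is hypothetical and the counts are invented.

```python
import math

def ic_terms(a, b, c, d):
    """Return (E, (v1, v2, v3)) following Equations 9 to 14."""
    n = a + b + c + d                                  # Equation 9
    r = (n + 2) ** 2 / ((a + b + 1) * (a + c + 1))     # Equation 10
    e = math.log2((a + 1) * r / (n + r))               # Equation 11
    v1 = (n - a + r - 1) / ((a + 1) * (1 + n + r))     # Equation 12
    v2 = (n - a - b + 1) / ((a + b + 1) * (3 + n))     # Equation 13
    v3 = (n - a - c + 1) / ((a + c + 1) * (3 + n))     # Equation 14
    return e, (v1, v2, v3)

# Invented counts for illustration only.
e_strong, v_strong = ic_terms(20, 80, 100, 9800)   # strongly disproportionate
e_flat, v_flat = ic_terms(1, 99, 99, 9801)         # close to independence
```

For the first set of counts E is clearly positive, while for counts close to independence E is near zero, which is the behaviour the IC > 0 criterion relies on.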
3.2 Signal detection method selection
The four mainstream signal detection methods are applied to the total sample and to the sub-sample respectively, and the positive signals obtained are compared below.
TABLE 3 Number of signals detected by each method in the two samples
(Table 3 appears only as an image in the source and is not reproduced here.)
As can be seen from Table 3, the ordering of the number of signals generated by each detection method is the same in both samples: it decreases in the order PRR, ROR, MHRA, IC. Coding a positive signal as "1" and a non-positive signal as "0", the correlation coefficients between the detection results of the four methods are shown in Tables 4 and 5.
TABLE 4 Correlation coefficients between detection methods in the total sample
(Table 4 appears only as an image in the source and is not reproduced here.)
TABLE 5 Correlation coefficients between detection methods in the sub-sample
(Table 5 appears only as an image in the source and is not reproduced here.)
As can be seen from Tables 4 and 5, the correlation coefficient between the PRR and ROR methods is close to 1, i.e. the two methods give essentially the same results; the correlation coefficients of MHRA with PRR and with ROR exceed 82%; the correlation coefficients between IC and the other three methods are low. The correlation coefficients among the methods are also lower in the sub-sample than in the total sample, by nearly 8% for the IC method in particular. The invention therefore selects the three methods PRR, MHRA and IC as the detection methods for data classification.
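The correlation between two methods' 0/1 detection results can be computed as in the following Python sketch. The text does not say which correlation coefficient was used; the sketch assumes the Pearson coefficient, which for two binary vectors equals the phi coefficient. The function name `binary_correlation` and the example vectors are hypothetical.

```python
import math

def binary_correlation(x, y):
    """Pearson correlation of two equal-length 0/1 sequences.

    Assumes neither sequence is constant (otherwise the variance is zero).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    var_y = sum((yi - mean_y) ** 2 for yi in y)
    return cov / math.sqrt(var_x * var_y)

# Invented detection results for four drug-adverse reaction combinations.
method_1 = [1, 1, 0, 0]
method_2 = [1, 1, 0, 0]   # identical to method_1
method_3 = [1, 0, 1, 0]   # agrees with method_1 on half of the combinations
```

Two methods that flag exactly the same combinations give a coefficient of 1, and methods whose flags are unrelated give a coefficient near 0.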
4 Design of the decision method
4.1 Decision flow design
Judging the quality of the signal detection results of the two samples before and after data separation raises the following questions: What is the decision flow? How effective are the signal detection methods on the two samples before and after separation? What indexes are needed to make the decision? How does separating the traditional Chinese medicine data from the total sample affect the signal detection results of the remaining data? The decision flow chart is given first, as shown in FIG. 1.
The traditional Chinese medicine data are separated from the total sample as the sub-sample, and the remaining data serve as the verification sample. Signal detection is performed on the total sample and on the sub-sample with the methods selected in section 3, giving two result sets; a decision method is constructed and the two result sets are compared to reach a decision conclusion. The conclusion is then verified with the remaining sample to decide whether to accept it.
4.2 Four-grid table based on the standard library
The standard library provides an objective basis for the classification decision. To compare the validity of the two data sets before and after separation, the standard library is used as the reference for testing. The detection results of the total sample and of the sub-sample are labeled against the drug-adverse reaction combinations in the standard library: a combination that appears in the standard library is labeled "1", otherwise "0". From the signal detection results of the two samples and their comparison with the standard library, the following four-grid table can be constructed (Table 6).
TABLE 6 Four-grid table comparing the detection results of the two samples against the standard library
 | Sub-sample + | Sub-sample -
Total sample + | a (a_1, a_0) | b (b_1, b_0)
Total sample - | c (c_1, c_0) | d (d_1, d_0)
In Table 6, a is the number of ADR combinations detected as positive signals in both the total sample and the sub-sample; b is the number detected as positive in the total sample and negative in the sub-sample; c is the number detected as negative in the total sample and positive in the sub-sample; d is the number detected as negative in both. a_1, b_1, c_1 and d_1 are the numbers of ADR combinations in a, b, c and d that are present in the standard library: b_1 can be regarded as the number of standard-library signals missed by the sub-sample and c_1 as the number missed by the total sample; the combinations counted in b_1 and c_1 are the signals on which the two samples differ, and can serve as a basis for the classification decision. a_0, b_0, c_0 and d_0 are the numbers of ADR combinations in a, b, c and d that are not present in the standard library. Thus a = a_1 + a_0, b = b_1 + b_0, c = c_1 + c_0 and d = d_1 + d_0.
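The cell counts of Table 6 can be tallied as in the Python sketch below. It is an illustration under stated assumptions: each detection result is represented as the set of combinations flagged positive, all names (`four_grid`, the argument names, the toy identifiers) are hypothetical, and the inner keys 1 and 0 correspond to the subscripts of a_1, a_0 and so on.

```python
def four_grid(total_positive, sub_positive, combinations, standard_library):
    """Count combinations per cell a, b, c, d, split by known (1) / unknown (0)."""
    cells = {name: {1: 0, 0: 0} for name in "abcd"}
    for combo in combinations:
        in_total = combo in total_positive
        in_sub = combo in sub_positive
        if in_total and in_sub:
            cell = "a"      # positive in both samples
        elif in_total:
            cell = "b"      # positive in total sample only
        elif in_sub:
            cell = "c"      # positive in sub-sample only
        else:
            cell = "d"      # negative in both samples
        cells[cell][int(combo in standard_library)] += 1
    return cells

# Toy identifiers, invented for illustration only.
combinations = ["c1", "c2", "c3", "c4", "c5", "c6"]
total_positive = {"c1", "c2", "c3"}
sub_positive = {"c1", "c2", "c4"}
standard_library = {"c1", "c3", "c4", "c6"}

cells = four_grid(total_positive, sub_positive, combinations, standard_library)
```

Here `cells["b"][1]` plays the role of b_1 (known signals found only with the total sample) and `cells["c"][1]` the role of c_1 (known signals found only with the sub-sample).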
4.3 Decision index design
(1) Decision index R: recall
Recall is the proportion of the known signals in the standard library (i.e. a_1 + b_1 + c_1 + d_1) that are detected; it measures the coverage of known signals. The recall R_1 describes the ability of the total sample to detect known signals, as shown in Equation 17:
$$R_1 = \frac{a_1 + b_1}{a_1 + b_1 + c_1 + d_1} \qquad \text{(Equation 17)}$$
The recall R_2 describes the ability of the sub-sample to detect known signals, as shown in Equation 18:
$$R_2 = \frac{a_1 + c_1}{a_1 + b_1 + c_1 + d_1} \qquad \text{(Equation 18)}$$
Comparing the recall of the two samples distinguishes their ability to detect known ADR signals and also reflects the difference in sensitivity of the two samples relative to the standard library. Recall is therefore a key index and should be the primary basis for the classification decision.
(2) Decision index P: precision
The drawback of Equations 17 and 18 is that when a_1 is far greater than b_1 and c_1, R_1 and R_2 differ little even if b_1 and c_1 differ greatly. Therefore, when R_1 and R_2 differ little, precision must also be defined. Precision is the proportion of known signals among the signals detected from a given sample; it measures the accuracy of signal detection. The precision P_1 describes the proportion of known signals among the signals detected from the total sample, as shown in Equation 19:
$$P_1 = \frac{a_1 + b_1}{a + b} \qquad \text{(Equation 19)}$$
The precision P_2 describes the proportion of known signals among the signals detected from the sub-sample, as shown in Equation 20:
$$P_2 = \frac{a_1 + c_1}{a + c} \qquad \text{(Equation 20)}$$
(3) Decision index D: difference rate
The difference rate measures the difference between the two detection results, as shown in Equation 21:
(Equation 21 appears only as an image in the source and is not reproduced here.)
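The recall and precision indexes can be computed from the Table 6 counts as in the Python sketch below. It follows the verbal definitions in this section (recall: detected known signals over all known signals; precision: detected known signals over all detected signals), which is how Equations 17 to 20 are read here. The difference rate D is left out because its formula is not available in the text. The function name and the counts are hypothetical.

```python
def recall_precision(a1, a0, b1, b0, c1, c0, d1, d0):
    """Return (R1, R2, P1, P2) from the cell counts of the four-grid table."""
    known = a1 + b1 + c1 + d1            # all known signals in the standard library
    r1 = (a1 + b1) / known               # recall of the total sample
    r2 = (a1 + c1) / known               # recall of the sub-sample
    p1 = (a1 + b1) / (a1 + a0 + b1 + b0)  # precision of the total sample: known / (a + b)
    p2 = (a1 + c1) / (a1 + a0 + c1 + c0)  # precision of the sub-sample: known / (a + c)
    return r1, r2, p1, p2

# Invented counts for illustration only.
r1, r2, p1, p2 = recall_precision(a1=40, a0=10, b1=5, b0=5,
                                  c1=15, c0=5, d1=40, d0=80)
```

With these invented counts the sub-sample has the higher recall (0.55 against 0.45) because c1 exceeds b1, i.e. it recovers more known signals that the total sample misses.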
4.4 decision Tree design
Whether the traditional Chinese medicine data should be detected separately (classification detection) is decided from the R, P and D indices of the total sample and the subsample. The procedure is as follows:
1) The primary basis for the decision is how well the signal detection results of the two samples cover the known signals, i.e. the signal recall ratio of the two samples (see equations 17 and 18). The sample with the higher recall ratio is preferred for signal detection: if the total-sample recall ratio R1 is higher, classification detection of the total sample is not needed; if the subsample recall ratio R2 is higher, the total sample needs classification detection; if the two differ only slightly (the threshold is set to 2%), a further decision is needed.
2) When the recall ratios of the two samples differ only slightly, the accuracy of signal detection, i.e. the signal precision ratio of the two samples (see equations 19 and 20), is considered. If the total-sample precision ratio P1 is higher, classification detection of the total sample is not needed; if the subsample precision ratio P2 is higher, the total sample needs classification detection; if the two differ only slightly (the threshold is set to 2%), a further decision is needed.
3) When neither the recall ratio nor the precision ratio of the two samples differs much (both differences are within 2%), the difference rate is compared directly: if it is greater than a given threshold (set to 10%), the total sample does not need classification detection; otherwise, the total sample needs classification detection.
4) When the decision is that the total sample needs classification detection, the total sample and the remaining data left after the traditional Chinese medicines are separated out (Western medicines and biological products) are tested. If the three indices given by equations 19, 20 and 21 are all within the acceptable range, the result that the total sample needs classification detection is accepted; otherwise it is not accepted. Based on the above analysis, the decision tree for whether the data should be classified is constructed as shown in Fig. 2.
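Steps 1) to 3) above can be sketched as a small function. This is only an illustrative sketch, not the patented implementation: the function name `decide_classification`, its parameters, and the choice to pass signed differences in percentage points are assumptions made here. The 2% and 10% thresholds come from the text. A difference of exactly 2% is treated as "not large", which is also an assumption, since the text does not say how ties are handled. The verification of step 4) is not covered.

```python
from typing import Optional, Tuple

RP_THRESHOLD = 2.0   # percentage points; threshold for recall and precision differences
D_THRESHOLD = 10.0   # percent; threshold for the difference rate D


def decide_classification(dr: float, dp: float,
                          d: Optional[float] = None) -> Tuple[bool, str]:
    """Return (needs_classification, deciding_index).

    dr = R1 - R2 and dp = P1 - P2 (total sample minus subsample),
    both in percentage points. d is the difference rate D in percent;
    it is only needed when step 3 is reached.
    """
    # Step 1: recall. A clear difference decides in favour of the better sample.
    if abs(dr) > RP_THRESHOLD:
        return dr < 0, "R"   # subsample better -> classification needed
    # Step 2: precision, same rule.
    if abs(dp) > RP_THRESHOLD:
        return dp < 0, "P"
    # Step 3: difference rate. D above the threshold means no classification.
    if d is None:
        raise ValueError("D is required when recall and precision are both close")
    return d <= D_THRESHOLD, "D"
```

For example, `decide_classification(3.0, 0.0)` returns `(False, "R")`: the total sample's recall is higher by more than 2 points, so classification detection is not needed.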
5 classification detection result analysis and decision
The experimental steps of classification decision are as follows:
(1) calculate a, b, c and d (see Table 3) for each drug-adverse reaction combination in the total sample and in the subsample, and compute the values of the three detection methods for each;
(2) for ease of comparison, the data in the total sample that are not traditional Chinese medicine are removed; the removed data form the remaining data, which serve as the validation sample. After this processing, the total sample and the subsample both contain the same 4697 traditional Chinese medicine drug-adverse reaction combinations, in one-to-one correspondence;
(3) extract signals from the total sample and the subsample using the signal criteria of the three detection methods, and compare them with the standard library;
(4) the classification decision is made using the detection method of section 4.
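Step (1) can be illustrated for the PRR and MHRA methods with a short sketch. Table 3 is not reproduced in this part of the text, so the cell layout below is an assumption: the conventional 2x2 table with a = target drug and target reaction, b = target drug and other reactions, c = other drugs and target reaction, d = other drugs and other reactions. The formulas and the MHRA criteria (PRR >= 2, chi-square >= 4, a >= 3) are the commonly cited definitions, not values taken from this patent, and the uncorrected Pearson chi-square is likewise an assumption. The IC method is not shown.

```python
def prr(a: int, b: int, c: int, d: int) -> float:
    """Proportional reporting ratio: [a / (a + b)] / [c / (c + d)]."""
    if c == 0:
        return float("inf")  # no reports of the reaction for other drugs
    return (a / (a + b)) / (c / (c + d))


def chi_square(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square of the 2x2 table, without continuity correction."""
    n = a + b + c + d
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    if margins == 0:
        return 0.0  # degenerate table: an empty row or column
    return n * (a * d - b * c) ** 2 / margins


def is_mhra_signal(a: int, b: int, c: int, d: int) -> bool:
    """Commonly cited MHRA criteria: a >= 3, PRR >= 2 and chi-square >= 4."""
    return a >= 3 and prr(a, b, c, d) >= 2 and chi_square(a, b, c, d) >= 4
```

With the made-up counts a = 20, b = 80, c = 50, d = 850, the PRR is 0.2 / (50 / 900) = 3.6 and the combination meets all three criteria.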
The total sample and the subsample were tested separately using the ADR signal detection methods PRR, MHRA and IC, and the results are shown in table 7:
TABLE 7 comparison table of test results of three test methods
Figure BDA0001611977390000101
Figure BDA0001611977390000111
TABLE 8 decision basis table for discriminating Chinese medicine data classification detection
Figure BDA0001611977390000112
The decision basis table (Table 8) is built from the data in Table 7 and the classification decision tree. According to the decision tree constructed in 4.4, the decision process is as follows:
(1) PRR method: since R1 - R2 = 0.72% < 2%, P1 - P2 = 0.16% < 2% and D = 7.16% < 10%, the decision tree gives the conclusion "need to perform classification detection on the total sample".
(2) MHRA method: since R1 - R2 = 0.16% < 2%, P1 - P2 = 0.972% < 2% and D = 3.26% < 10%, the decision tree gives the conclusion "need to perform classification detection on the total sample".
(3) IC method: since R1 - R2 = 1.44% < 2% and P2 - P1 = 4.06% > 2%, the decision tree gives the conclusion "need to perform classification detection on the total sample".
The decision paths of PRR and MHRA are the same. For the IC method, the recall ratios of the two samples are very close, while the precision ratio of the traditional Chinese medicine data is clearly better than that of the total sample, so the conclusion is also "need to perform classification detection on the total sample". Because all three methods conclude that the total sample needs classification detection, the remaining data (Western medicines and biological products) must be tested in the same way. The resulting index data are as follows:
TABLE 9 decision basis table for verifying residual data classification detection
Figure BDA0001611977390000113
Figure BDA0001611977390000121
As can be seen from Table 9, when the remaining sample is used for verification, the differences in the R, P and D indices for all three detection methods are within the specified thresholds. In other words, separating the traditional Chinese medicine data from the total sample has little effect on signal detection in the remaining data, so the decision result "need to perform classification detection on the total sample" is accepted.
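The verification just described can be sketched as a simple acceptance check. The function name `accept_classification` and the input layout are assumptions made for illustration. The numbers in `placeholder` are made up, because Table 9 is only available as an image; they are not the patent's results.

```python
from typing import Dict, Tuple


def accept_classification(results: Dict[str, Tuple[float, float, float]],
                          rp_threshold: float = 2.0,
                          d_threshold: float = 10.0) -> bool:
    """Accept the "classify" decision only if every method stays within limits.

    results maps a method name to (dR, dP, D), measured between the total
    sample and the remaining data: dR and dP in percentage points, D in percent.
    """
    return all(
        abs(dr) < rp_threshold and abs(dp) < rp_threshold and d < d_threshold
        for dr, dp, d in results.values()
    )


# Made-up placeholder values, NOT taken from Table 9.
placeholder = {
    "PRR": (0.5, 0.3, 4.0),
    "MHRA": (0.2, 0.8, 2.5),
    "IC": (1.1, 1.5, 6.0),
}
print(accept_classification(placeholder))
```

Requiring all three methods to pass mirrors the text, which accepts the decision because R, P and D are within the thresholds for every detection method.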

Claims (7)

1. A decision method for discriminating the classification detection of adverse drug reaction signals is characterized by comprising the following steps:
step 1), data acquisition and processing: wherein the data acquisition comprises acquisition of raw ADR data and acquisition of a standard library; processing data, namely deleting medicines and adverse reaction data thereof which are not included in a standard library from the original ADR data, deleting data with the ADR frequency less than 3 to obtain a total sample, and separating the total sample into a sub-sample and a residual sample;
step 2) signal detection of data: performing signal detection on the total sample and the subsample by an ADR signal detection method;
step 3) judging the effectiveness of the total sample and the sub-sample before and after separation by using a decision index based on a standard library; the recall ratio R, the precision ratio P and the difference ratio D are used as decision indexes, and the decision process is executed as follows:
step 301) design a four-grid table based on a standard library: labeling the detection result based on the total sample and the sub-sample by using a drug-adverse reaction combination in a standard library, wherein if the combination appears in the standard library, the combination is labeled as 1, and if not, the combination is labeled as 0;
the elements of the four-grid table are a (a1, a0), b (b1, b0), c (c1, c0) and d (d1, d0), wherein a represents the number of ADR combinations detected as positive signals in both the total sample and the subsample; b represents the number of ADR combinations detected as positive signals in the total sample and negative signals in the subsample; c represents the number of ADR combinations detected as negative signals in the total sample and positive signals in the subsample; d represents the number of ADR combinations detected as negative signals in both the total sample and the subsample; a1, b1, c1 and d1 represent the numbers of ADR combinations in a, b, c and d, respectively, that are present in the standard library; a0, b0, c0 and d0 represent the numbers of ADR combinations in a, b, c and d, respectively, that are not present in the standard library; and a = a1 + a0, b = b1 + b0, c = c1 + c0, d = d1 + d0;
Step 302) recall ratio comparison: let the recall ratios of the total sample and the subsample be R1 and R2, respectively:
Figure FDA0003163981110000011
Figure FDA0003163981110000012
If R1 > R2 and R1 - R2 is greater than the first preset threshold, classification detection of the total sample is not needed; if R2 > R1 and R2 - R1 is greater than the first preset threshold, the total sample needs classification detection; if the difference between R1 and R2 is less than the first preset threshold, perform step 303);
step 303) precision ratio comparison: let the precision ratios of the total sample and the subsample be P1 and P2, respectively:
Figure FDA0003163981110000013
Figure FDA0003163981110000014
If P1 > P2 and P1 - P2 is greater than the first preset threshold, classification detection of the total sample is not needed; if P2 > P1 and P2 - P1 is greater than the first preset threshold, the total sample needs classification detection; if the difference between P1 and P2 is less than the first preset threshold, perform step 304);
step 304) comparison of the difference rates: let the difference rate be D,
Figure FDA0003163981110000015
if D is larger than a second preset threshold value, the classification detection of the total samples is not needed; otherwise, the classification detection is carried out on the total samples.
2. The decision method for discriminating classification detection of ADRs as claimed in claim 1, wherein: the original ADR data are obtained from the national adverse drug reaction monitoring center; the standard library is a library of known ADRs built from the package inserts of the relevant medicines collected online, the periodic adverse drug reaction information bulletins issued by the national food and drug administration, drug alert newsletters, and various regulatory documents.
3. The decision method for discriminating classification detection of ADRs as claimed in claim 1, wherein: the decision method further comprises a step of verifying the classification result: when the decision result is that the total sample needs classification detection, the residual sample and the total sample are tested; if the values of the decision indices R and P are smaller than the first preset threshold and the value of D is smaller than the second preset threshold, the result that the total sample needs classification detection is accepted; otherwise it is not accepted.
4. The decision method for discriminating classification detection of ADRs as claimed in claim 1, wherein: the total sample is all of the data, comprising the three categories of traditional Chinese medicines, Western medicines and biological products.
5. The decision method for discriminating classification detection of ADRs as claimed in claim 1, wherein: the subsamples are data of Chinese medicine categories; the residual sample is data of western medicine and biological product categories.
6. The decision method for discriminating classification detection of ADRs as claimed in claim 1, wherein: the ADR signal detection method in step 2) is the PRR, MHRA, ROR or IC method, each based on the principle of disproportionality measurement.
7. The decision method for discriminating classification detection of ADRs as claimed in claim 1, wherein: the first predetermined threshold is 2%; the second predetermined threshold is 10%.
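The four-grid table of step 301) and the indices of steps 302) and 303) can be sketched as follows. The equations for R1, R2, P1 and P2 appear only as images in this text, so the formulas in `recall_precision` are assumptions based on the usual definitions of recall and precision: known signals detected divided by all known combinations, and known signals detected divided by all positives. The function names and the toy data are hypothetical. The formula for D is not recoverable from the text and is left out.

```python
from typing import Dict, Iterable, List, Set


def fourfold_table(combos: Iterable[str], total_pos: Set[str],
                   sub_pos: Set[str], standard: Set[str]) -> Dict[str, List[int]]:
    """Count cells a, b, c, d; each value is [x0, x1] = [not in library, in library]."""
    cells = {key: [0, 0] for key in "abcd"}
    for combo in combos:
        in_total, in_sub = combo in total_pos, combo in sub_pos
        if in_total and in_sub:
            key = "a"   # positive in both samples
        elif in_total:
            key = "b"   # positive only in the total sample
        elif in_sub:
            key = "c"   # positive only in the subsample
        else:
            key = "d"   # negative in both samples
        cells[key][int(combo in standard)] += 1
    return cells


def recall_precision(cells: Dict[str, List[int]]) -> Dict[str, float]:
    """Assumed formulas; requires at least one known combination and one positive per sample."""
    a1, b1, c1, d1 = (cells[k][1] for k in "abcd")
    a, b, c = (sum(cells[k]) for k in "abc")
    known = a1 + b1 + c1 + d1           # combinations present in the standard library
    return {
        "R1": (a1 + b1) / known,        # total sample recall
        "R2": (a1 + c1) / known,        # subsample recall
        "P1": (a1 + b1) / (a + b),      # total sample precision
        "P2": (a1 + c1) / (a + c),      # subsample precision
    }


# Toy data: six hypothetical drug-ADR combinations labelled A to F.
cells = fourfold_table("ABCDEF", {"A", "B", "C"}, {"A", "D"}, {"A", "C", "D", "E"})
indices = recall_precision(cells)
```

In this toy case both samples recover two of the four known combinations (R1 = R2 = 0.5), but the subsample raises no false positive (P2 = 1.0 against P1 = 2/3), which is the kind of gap step 303) is meant to detect.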
CN201810275119.8A 2018-03-29 2018-03-29 Decision method for distinguishing classification detection of adverse drug reaction signals Active CN108831563B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810275119.8A CN108831563B (en) 2018-03-29 2018-03-29 Decision method for distinguishing classification detection of adverse drug reaction signals

Publications (2)

Publication Number Publication Date
CN108831563A CN108831563A (en) 2018-11-16
CN108831563B true CN108831563B (en) 2021-09-28

Family

ID=64154308

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810275119.8A Active CN108831563B (en) 2018-03-29 2018-03-29 Decision method for distinguishing classification detection of adverse drug reaction signals

Country Status (1)

Country Link
CN (1) CN108831563B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110767319A (en) * 2019-09-30 2020-02-07 南京邮电大学 Method for detecting adverse reaction signals of combined medication
CN112133450B (en) * 2019-10-15 2022-08-23 南京邮电大学 Method for eliminating adverse drug reaction data shielding effect based on decision tree layering
CN110879822B (en) * 2019-11-15 2022-11-15 南京邮电大学 Drug adverse reaction signal detection method based on association rule analysis
CN114300159A (en) * 2021-12-29 2022-04-08 浙江太美医疗科技股份有限公司 Method, apparatus, device and medium for generating a medication alert signal
TWI812056B (en) * 2022-03-10 2023-08-11 宏碁股份有限公司 Method and electronic device of checking drug interaction

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105760698A (en) * 2016-03-18 2016-07-13 华中科技大学同济医学院附属协和医院 Adverse drug reaction early warning and analyzing system and method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
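Application of signal detection in the adverse drug reaction monitoring system (信号检测在药品不良反应监测系统中的应用); Hou Yongfang (候永芳) et al.; Chinese Journal of Pharmacovigilance (《中国药物警戒》); 20120930; Vol. 9, No. 9; pp. 539-541 *
Research progress on commonly used adverse drug reaction signal detection methods (常用药品不良反应信号检测方法研究进展); Chen Yousheng (陈友生) et al.; Chinese Journal of Drug Dependence (《中国药物依赖性杂志》); 20140415; pp. 89-92 *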

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant