Disclosure of Invention
An object of the present disclosure is to provide a medicine information difference processing method and a medicine information difference processing system, thereby overcoming, at least to some extent, one or more problems caused by the limitations and disadvantages of the related art.
According to an aspect of the present disclosure, there is provided a medicine information difference processing method including:
acquiring a matching pair consisting of drug record data and standard drug data;
calculating a similarity value of each reference item of the drug record data and the standard drug data in the matching pair based on drug information codes;
calculating a difference value of each reference item based on the similarity value of each reference item and the judgment threshold corresponding to each reference item; and
performing a weighted average of the difference values of the reference items to obtain a comprehensive matching score of the matching pair.
In an exemplary embodiment of the present disclosure, calculating the difference value of each reference item based on the similarity value of each reference item and the judgment threshold corresponding to each reference item includes:
comparing the similarity value of the reference item with a judgment threshold corresponding to the reference item;
and obtaining the difference value of the reference item by exponentially amplifying the similarity value according to the comparison result, in combination with the corresponding judgment threshold, the corresponding compatibility probability, and the corresponding rejection probability.
In an exemplary embodiment of the present disclosure, comparing the similarity value of the reference item with the judgment threshold corresponding to the reference item includes:
comparing the similarity value of the reference item with the result obtained after the judgment threshold corresponding to the reference item is corrected by a correction factor.
In an exemplary embodiment of the present disclosure, obtaining a difference value of the reference item by performing an exponential amplification on the similarity value according to a comparison result in combination with a corresponding judgment threshold, a corresponding compatibility probability, and a corresponding rejection probability includes:
when the similarity value of the reference item is larger than the corresponding judgment threshold, the similarity value is exponentially amplified through the following formula to obtain the difference value of the reference item:
$$\text{difference value} = s \cdot \log_{n}\!\left(\frac{c}{d}\right) \cdot e$$
wherein s is the similarity value of the reference item, f is the judgment threshold, c is the compatibility probability, d is the rejection probability, n is the number of difference factors, and e is the amplification factor; and
when the similarity value of the reference item is smaller than the corresponding judgment threshold, the similarity value is exponentially reduced through the following formula to obtain the difference value of the reference item:
$$\text{difference value} = s \cdot \log_{n}\!\left(\frac{1-c}{1-d}\right) \cdot e$$
wherein s is the similarity value of the reference item, f is the judgment threshold, c is the compatibility probability, d is the rejection probability, n is the number of difference factors, and e is the amplification factor.
In an exemplary embodiment of the present disclosure, the medicine information code includes one or more of a character code, a font code, and a pronunciation code.
According to an aspect of the present disclosure, there is provided a medicine information difference processing system including:
a matching pair acquisition unit for acquiring a matching pair consisting of drug record data and standard drug data;
a similarity value calculation unit for calculating a similarity value of each reference item of the drug record data and the standard drug data in the matching pair based on drug information codes;
a difference value calculation unit for calculating a difference value of each reference item based on the similarity value of each reference item and the judgment threshold corresponding to each reference item; and
a matching score calculation unit for performing a weighted average of the difference values of the reference items to obtain a comprehensive matching score of the matching pair.
In an exemplary embodiment of the present disclosure, calculating the difference value of each reference item based on the similarity value of each reference item and the judgment threshold corresponding to each reference item includes:
comparing the similarity value of the reference item with a judgment threshold corresponding to the reference item;
and obtaining the difference value of the reference item by exponentially amplifying the similarity value according to the comparison result, in combination with the corresponding judgment threshold, the corresponding compatibility probability, and the corresponding rejection probability.
In an exemplary embodiment of the present disclosure, comparing the similarity value of the reference item with the judgment threshold corresponding to the reference item includes:
comparing the similarity value of the reference item with the result obtained after the judgment threshold corresponding to the reference item is corrected by a correction factor.
According to an aspect of the present disclosure, there is provided an electronic apparatus including:
a processor; and
a memory having stored thereon a computer program that, when executed by the processor, implements any of the drug information difference processing methods described above.
According to an aspect of the present disclosure, there is provided a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements a drug information difference processing method according to any one of the above.
In the drug information difference processing method and the drug information difference processing system of an exemplary embodiment of the present disclosure, the similarity value of each reference item in the acquired matching pair is calculated based on the drug information code; the difference value of each reference item is calculated based on the similarity value of that reference item and its judgment threshold; and the difference values of the reference items are weighted and averaged to obtain a comprehensive matching score of the matching pair. First, because the similarity values are calculated through the drug information code, the similarity value of each reference item in the matching pair can be obtained accurately. Second, because the difference value of each reference item is calculated from its similarity value and judgment threshold, the similarity value can be amplified differently depending on how it compares with the judgment threshold, so that the similarity values of the reference items are effectively distinguished. Third, because the difference values of the reference items are weighted and averaged to obtain the comprehensive matching score, the gaps between the matching scores of different matching pairs are markedly widened, so that the matching pairs can be effectively distinguished, the matching result is more accurate, and the matching efficiency is improved.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art. The same reference numerals denote the same or similar parts in the drawings, and thus, a repetitive description thereof will be omitted.
Furthermore, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a thorough understanding of embodiments of the disclosure. One skilled in the relevant art will recognize, however, that the embodiments of the disclosure can be practiced without one or more of the specific details, or with other methods, components, materials, devices, steps, and so forth. In other instances, well-known structures, methods, devices, implementations, materials, or operations are not shown or described in detail to avoid obscuring aspects of the disclosure.
The block diagrams shown in the figures are functional entities only and do not necessarily correspond to physically separate entities. That is, these functional entities may be implemented in the form of software, or in one or more hardware modules, or in different networks and/or processor devices and/or microcontroller devices.
In the present exemplary embodiment, a drug information difference processing method is first provided. Referring to FIG. 2, the drug information difference processing method may include the following steps:
Step S210, acquiring a matching pair consisting of drug record data and standard drug data;
Step S220, calculating a similarity value of each reference item of the drug record data and the standard drug data in the matching pair based on drug information codes;
Step S230, calculating a difference value of each reference item based on the similarity value of each reference item and the judgment threshold corresponding to each reference item; and
Step S240, performing a weighted average of the difference values of the reference items to obtain a comprehensive matching score of the matching pair.
According to the drug information difference processing method in the present exemplary embodiment, first, because the similarity values are calculated through the drug information code, the similarity value of each reference item in the acquired matching pair can be obtained accurately. Second, because the difference value of each reference item is calculated from its similarity value and judgment threshold, the similarity value can be amplified differently depending on how it compares with the judgment threshold, so that the similarity values of the reference items are effectively distinguished. Third, because the difference values of the reference items are weighted and averaged to obtain the comprehensive matching score of the matching pair, the gaps between the matching scores of different matching pairs are markedly widened, so that the matching pairs can be effectively distinguished, the matching result is more accurate, and the matching efficiency is improved.
Next, the medicine information difference processing method in the present exemplary embodiment will be described in detail.
In step S210, a matching pair consisting of the drug record data and the standard drug data is acquired.
In the present exemplary embodiment, the drug record data may be acquired from a medical information system of a hospital, such as an HIS (hospital information system) or an EMR (electronic medical record) system; the drug record data may also be acquired from a CIS (clinical information system) or a drug information database of a hospital, which is not particularly limited by the present disclosure. The standard drug data may be established according to the "Chinese Pharmacopoeia", or according to other medical standards, for example the "United States Pharmacopoeia", the "European Pharmacopoeia", the "WHO International Pharmacopoeia", the "Traditional Chinese Medicine Patent Drug Preparations" issued by the Ministry of Health, or the "National Chinese Patent Drug Standards Compilation", and the disclosure is not particularly limited thereto.
In the present exemplary embodiment, a unified interface may be designed for different types of databases in the hospital data system, such as MySQL, SQL Server, Oracle, DB2, and the like, and the drug record data and the standard drug data in each database may be acquired through the unified interface.
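A unified interface of this kind might be sketched as follows. This is only an illustration: the `DrugSource` protocol, the `SqliteDrugSource` adapter, the `load_all` helper, and the table and column names are all hypothetical, and SQLite (from the Python standard library) merely stands in for the MySQL, SQL Server, Oracle, or DB2 adapters that a hospital deployment would need.

```python
import sqlite3
from typing import Dict, Iterable, List, Protocol


class DrugSource(Protocol):
    """Hypothetical unified interface: every backend returns rows as dicts."""

    def fetch_records(self) -> List[Dict[str, object]]:
        ...


class SqliteDrugSource:
    """Illustrative adapter; each other database type would get its own."""

    def __init__(self, conn: sqlite3.Connection, table: str) -> None:
        self._conn = conn
        self._table = table  # assumed to be a trusted, hard-coded table name

    def fetch_records(self) -> List[Dict[str, object]]:
        cursor = self._conn.execute(f"SELECT * FROM {self._table}")
        columns = [col[0] for col in cursor.description]
        return [dict(zip(columns, row)) for row in cursor.fetchall()]


def load_all(sources: Iterable[DrugSource]) -> List[Dict[str, object]]:
    """Collect rows from every source through the same method call."""
    records: List[Dict[str, object]] = []
    for source in sources:
        records.extend(source.fetch_records())
    return records


# Demo with an in-memory database and made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE drug_record (drug_number TEXT, drug_name TEXT)")
conn.execute("INSERT INTO drug_record VALUES ('M185', 'example drug A')")
conn.execute("INSERT INTO drug_record VALUES ('M423', 'example drug B')")
records = load_all([SqliteDrugSource(conn, "drug_record")])
```

The calling code depends only on `fetch_records`, so supporting a further database type would mean adding another adapter rather than changing the matching logic.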
Further, in the present exemplary embodiment, the drug record data and the standard drug data each contain a plurality of reference items, each corresponding to a reference frame. Referring to FIG. 3, the light areas indicate the non-standard drug record data to be processed, and the dark areas indicate the standard drug data. Each piece of drug record data in FIG. 3 may include 7 reference items, each corresponding to a reference frame; for example, the reference frames corresponding to the 7 reference items in FIG. 3 are: drug number, approval number, drug name (Chinese), drug name (English), formulation specification, drug dosage form, and manufacturer name. It should be noted that, although these 7 reference frames are shown in the present exemplary embodiment, the reference frames in the exemplary embodiments of the present disclosure are not limited thereto; for example, the reference frames may also be drug components, drug functions, storage modes, and the like, which are also within the protection scope of the present disclosure.
Next, in step S220, similarity values of the reference items of the drug record data and the standard drug data in the matching pair are calculated based on drug information codes.
In the present exemplary embodiment, the medicine information code may include one or more of a character code, a font code, and a pronunciation code, and in an exemplary embodiment of the present disclosure, the medicine information code may further include a font code, a phrase code, and the like, which are also within the scope of the present disclosure.
Further, in the present exemplary embodiment, the similarity value of each reference item in the matching pair may be calculated using the edit distance, but the exemplary embodiments of the present disclosure are not limited thereto; for example, the similarity value of each reference item in the matching pair may also be calculated by an algorithm such as the N-Gram algorithm or the Soundex algorithm, which is not particularly limited by the present disclosure.
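As a minimal sketch of an edit-distance similarity, the following computes the Levenshtein distance between two strings and rescales it to a 0 to 100 range. The rescaling (dividing by the length of the longer string) and the function names are assumptions made for illustration; the disclosure does not state how the edit distance is converted into the similarity values shown in its tables.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling one-row table."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Assumed normalisation: 100 for identical strings, 0 for disjoint ones."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 100.0  # two empty fields are treated as identical
    return 100.0 * (1.0 - edit_distance(a, b) / longest)
```

The same function could be applied to each reference item in turn, for example to the drug name field of a drug record and of a standard drug entry.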
Specifically, Table 1 below shows the similarity values of the respective reference items calculated using the edit distance. In Table 1, the NID column identifies the standard drug data, the MID column identifies the non-standard drug record data, and R1 to R6 denote different reference frames, which may be, for example, drug number, approval number, drug name (Chinese), drug name (English), formulation specification, and drug dosage form.
TABLE 1 Similarity values of the reference items in the matching pairs
NID | MID | R1 | R2 | R3 | R4 | R5 | R6
N102 | M185 | 91.06427 | 83.02689 | 55.76598 | 85.38511 | 42.87573 | 83.00242
N102 | M423 | 67.88041 | 67.07605 | 89.16404 | 75.7294 | 75.68885 | 60.30778
N102 | M902 | 86.84194 | 63.75839 | 88.71935 | 58.84218 | 66.24005 | 65.45528
N102 | M580 | 76.63916 | 84.76704 | 62.67677 | 64.97011 | 72.45706 | 62.17329
N102 | M1022 | 56.88255 | 93.85251 | 49.15134 | 91.16981 | 82.73499 | 49.01752
N102 | M276 | 45.39508 | 49.52274 | 80.3929 | 39.7603 | 73.90614 | 40.16648
N102 | M986 | 56.07566 | 47.0751 | 48.23597 | 55.15449 | 57.2735 | 56.68381
N102 | M556 | 21.67067 | 50.90629 | 21.7255 | 57.97105 | 29.45841 | 32.19147
N102 | M851 | 38.69177 | 22.67075 | 32.35278 | 36.74417 | 39.38383 | 34.78763
N102 | M265 | 13.63181 | 75.12384 | 23.11756 | 60.24734 | 5.905576 | 24.16077
N102 | M897 | 28.86901 | 84.51688 | 16.08536 | 16.22755 | 17.17475 | 24.42942
Next, in step S230, a difference value of each reference item is calculated based on the similarity value of each reference item and the judgment threshold corresponding to each reference item.
In the present exemplary embodiment, in order to better differentiate the similarity values of the reference items, a judgment threshold for the similarity value of each reference frame may be preset according to the importance of the content of that reference frame. The judgment threshold corresponding to each reference frame may furthermore be corrected according to the processing results of the drug information difference processing method of the present exemplary embodiment, for example by setting a correction coefficient according to the processing results. After the judgment thresholds have been preset, the similarity value of each reference item may be compared with the judgment threshold corresponding to that reference item: when the similarity value is greater than the corresponding judgment threshold, the similarity value is subjected to positive amplification, and when the similarity value is less than the corresponding judgment threshold, the similarity value is subjected to negative amplification.
For example, in the present exemplary embodiment, assume that the judgment threshold of the reference item is f, the correction coefficient is h, the compatibility probability is c, the rejection probability is d, the number of difference factors is n, and the amplification coefficient is e. The similarity value of the reference item may be compared directly with the judgment threshold f corresponding to the reference item; or with the sum of the corresponding judgment threshold and the correction coefficient, for example f + h; or with the value of the expression f + h (f + h), which is also within the protection scope of the present disclosure. Table 2 below shows preset values of the judgment threshold f, the correction coefficient h, the compatibility probability c, and the rejection probability d corresponding to each reference frame according to the present exemplary embodiment:
TABLE 2 Preset values corresponding to each reference frame
Reference item | Judgment threshold f | Correction factor h | Compatibility probability c | Rejection probability d
R1 | 53.05839 | 0.05031834 | 0.7703145 | 0.2933824
R2 | 65.66332 | -0.1913443 | 0.8428367 | 0.0861728
R3 | 51.58069 | -0.0240676 | 0.9510112 | 0.0806468
R4 | 58.38195 | -0.0771713 | 0.9676309 | 0.1304032
R5 | 51.19081 | -0.0369349 | 0.785452 | 0.2617031
R6 | 48.39781 | 0.03709961 | 0.9866528 | 0.3030836
It should be noted that the preset values are only examples in the present exemplary embodiment, and the preset values of the judgment threshold f, the correction coefficient h, the compatibility probability c, and the rejection probability d corresponding to each reference frame may also be modified according to the processing result, which is also within the protection scope of the present disclosure.
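The three ways of comparing a similarity value with its threshold can be sketched as below, using the preset values of Table 2. The names `ItemParams`, `effective_threshold`, `is_above`, and the mode labels are hypothetical, and the third mode reads the expression f + h (f + h) literally as f + h·(f + h), which is an assumption about the intended grouping.

```python
from typing import NamedTuple


class ItemParams(NamedTuple):
    f: float  # judgment threshold
    h: float  # correction coefficient
    c: float  # compatibility probability
    d: float  # rejection probability


# Preset values copied from Table 2.
TABLE_2 = {
    "R1": ItemParams(53.05839, 0.05031834, 0.7703145, 0.2933824),
    "R2": ItemParams(65.66332, -0.1913443, 0.8428367, 0.0861728),
    "R3": ItemParams(51.58069, -0.0240676, 0.9510112, 0.0806468),
    "R4": ItemParams(58.38195, -0.0771713, 0.9676309, 0.1304032),
    "R5": ItemParams(51.19081, -0.0369349, 0.785452, 0.2617031),
    "R6": ItemParams(48.39781, 0.03709961, 0.9866528, 0.3030836),
}


def effective_threshold(p: ItemParams, mode: str) -> float:
    """Return the value the similarity is compared against."""
    if mode == "plain":
        return p.f                      # compare with f directly
    if mode == "sum":
        return p.f + p.h                # compare with f + h
    if mode == "expression":
        return p.f + p.h * (p.f + p.h)  # literal reading of f + h (f + h)
    raise ValueError(f"unknown mode: {mode}")


def is_above(similarity: float, p: ItemParams, mode: str = "plain") -> bool:
    """True selects positive amplification, False negative amplification."""
    return similarity > effective_threshold(p, mode)
```

For instance, the R4 similarity of matching pair M556 (57.97105) lies below f = 58.38195 but above the third variant's threshold of about 53.88, which agrees with the positive entry for that cell in Table 3. The behaviour when the similarity equals the threshold is not specified in the text; this sketch treats it as not above.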
Further, after the similarity value of the reference item is compared with the corresponding judgment threshold, the difference value of the reference item can be obtained by combining the corresponding judgment threshold, the compatibility probability and the rejection probability according to the comparison result and adopting an exponential amplification mode for the similarity value. For example, when the similarity value of a reference item is greater than the corresponding judgment threshold, the similarity value is exponentially amplified by the following formula to obtain the difference value of each reference item:
$$\text{difference value} = s \cdot \log_{n}\!\left(\frac{c}{d}\right) \cdot e \tag{1}$$
wherein s is the similarity value of the reference item, f is the judgment threshold, c is the compatibility probability, d is the rejection probability, n is the number of difference factors, and e is the amplification factor; and
when the similarity value of the reference item is smaller than the corresponding judgment threshold, the similarity value is exponentially reduced through the following formula to obtain the difference value of the reference item:
$$\text{difference value} = s \cdot \log_{n}\!\left(\frac{1-c}{1-d}\right) \cdot e \tag{2}$$
wherein s is the similarity value of the reference item, f is the judgment threshold, c is the compatibility probability, d is the rejection probability, n is the number of difference factors, and e is the amplification factor.
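Formulas (1) and (2) can be sketched in code as follows. The sketch rests on several assumptions that are not stated in the text: the formulas are read as the similarity value multiplied by the base-n logarithm of the probability ratio and by e; the defaults n = 2 and e = 1 are chosen because they reproduce the sample entries of Table 3 checked below; and the comparison uses the literal expression f + h·(f + h). The function name `difference_value` is hypothetical.

```python
import math


def difference_value(s: float, f: float, h: float, c: float, d: float,
                     n: float = 2.0, e: float = 1.0) -> float:
    """Amplify similarity s upward or downward relative to its threshold.

    n and e default to values inferred from the worked tables, not to
    values given in the text.
    """
    threshold = f + h * (f + h)  # assumed corrected threshold
    if s > threshold:
        ratio = c / d                    # formula (1): positive amplification
    else:
        ratio = (1.0 - c) / (1.0 - d)    # formula (2): negative amplification
    # math.log(x, base) returns the logarithm of x to the given base.
    return s * math.log(ratio, n) * e


# R1 and R5 of matching pair N102/M185, with parameters from Table 2.
r1_m185 = difference_value(91.06427, 53.05839, 0.05031834, 0.7703145, 0.2933824)
r5_m185 = difference_value(42.87573, 51.19081, -0.0369349, 0.785452, 0.2617031)
```

Under these assumptions the two calls give approximately 126.82 and -76.44, in line with the R1 and R5 entries for M185 in Table 3. Because c exceeds d in every row of Table 2, the ratio in formula (1) is greater than 1 and the ratio in formula (2) is less than 1, which is what makes the first result positive and the second negative.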
For example, Table 3 below shows the difference values of the reference items obtained after exponentially amplifying the similarity values of the reference items. In Table 3, the NID column identifies the standard drug data, the MID column identifies the non-standard drug record data, and R1 to R6 denote different reference frames, such as drug number, approval number, drug name (Chinese), drug name (English), formulation specification, and drug dosage form.
TABLE 3 Difference values obtained after exponential amplification of the similarity values
NID | MID | R1 | R2 | R3 | R4 | R5 | R6
N102 | M185 | 126.8221 | 273.154268 | 198.5142 | 246.8891 | -76.4432 | 141.3387
N102 | M423 | 94.5347 | 220.676804 | 317.4037 | 218.9698 | 120.0116 | 102.6937
N102 | M902 | 120.9418 | 209.761874 | 315.8207 | 170.1408 | 105.0297 | 111.459
N102 | M580 | 106.7327 | 278.879278 | 223.115 | 187.8596 | 114.8873 | 105.8703
N102 | M1022 | 79.21836 | 308.769993 | -207.915 | 263.6154 | 131.1839 | -279.712
N102 | M276 | -73.5977 | -125.77079 | 286.1804 | -188.768 | 117.185 | -229.205
N102 | M986 | 78.09464 | -119.55462 | -204.043 | 159.478 | 90.81239 | 96.52271
N102 | M556 | -35.134 | -129.28452 | -91.901 | 167.622 | -52.5214 | -183.697
N102 | M851 | -62.7298 | -57.575931 | -136.855 | -174.449 | -70.2174 | -198.511
N102 | M265 | -22.1008 | 247.15363 | -97.7895 | 174.2038 | -10.5291 | -137.87
N102 | M897 | -46.8045 | 278.056272 | -68.0426 | -77.0429 | -30.6209 | -139.403
Next, in step S240, the difference values of the reference items are weighted and averaged to obtain a comprehensive matching score of the matching pair.
In this exemplary embodiment, the average value of the difference values of the reference items may be directly calculated to obtain the comprehensive matching score of the matching pair, or the difference values of the reference items may be weighted and averaged to obtain the comprehensive matching score of the matching pair, and the weight of the difference value of each reference item may be determined according to the importance degree of the content of each reference system.
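A minimal sketch of this scoring step is given below. The function name `composite_score` and the example weights are hypothetical; the disclosure does not give concrete weights, only that they may reflect the importance of each reference frame. With no weights supplied, the function reduces to the plain average.

```python
from typing import Optional, Sequence


def composite_score(differences: Sequence[float],
                    weights: Optional[Sequence[float]] = None) -> float:
    """Weighted average of per-item difference values."""
    if not differences:
        raise ValueError("at least one difference value is required")
    if weights is None:
        weights = [1.0] * len(differences)  # plain average
    if len(weights) != len(differences):
        raise ValueError("one weight per difference value is required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(w * x for w, x in zip(weights, differences)) / total


# Difference values of matching pair N102/M185 (R1 to R6) from Table 3.
m185 = [126.8221, 273.154268, 198.5142, 246.8891, -76.4432, 141.3387]
pow_m185 = composite_score(m185)
# Hypothetical weights that emphasise the two drug-name items (R3, R4).
weighted_m185 = composite_score(m185, [1, 1, 3, 3, 1, 1])
```

The unweighted result for M185 is about 151.71, which matches the POW value listed for that matching pair in Table 4.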
Specifically, Table 4 below compares the comprehensive matching score obtained by directly averaging the similarity values of the reference items with the comprehensive matching score obtained by directly averaging the difference values of the reference items. In Table 4, the NID column identifies the standard drug data, the MID column identifies the non-standard drug record data, and R1 to R6 denote different reference frames, for example drug number, approval number, drug name (Chinese), drug name (English), formulation specification, and drug dosage form. AVG is the average of the similarity values of the reference items, and POW is the average of the difference values of the reference items; each serves as a comprehensive matching score. As can be seen from Table 4, the AVG scores range only from 31.21716 to 73.52007, whereas the POW scores calculated with the drug information difference processing method of the present exemplary embodiment range from -116.723 to 179.0484. The gaps between the comprehensive matching scores are thus significantly enlarged, so that the matching pairs can be effectively distinguished, the matching accuracy can be improved, and the matching efficiency is further improved.
TABLE 4 Comparison of the matching score POW calculated using the present exemplary embodiment with the matching score AVG obtained by directly averaging the similarity values
NID | MID | R1 | R2 | R3 | R4 | R5 | R6 | AVG | POW
N102 | M185 | 91.06427 | 83.02689 | 55.76598 | 85.38511 | 42.87573 | 83.00242 | 73.52007 | 151.7125
N102 | M423 | 67.88041 | 67.07605 | 89.16404 | 75.7294 | 75.68885 | 60.30778 | 72.64109 | 179.0484
N102 | M902 | 86.84194 | 63.75839 | 88.71935 | 58.84218 | 66.24005 | 65.45528 | 71.64287 | 172.1923
N102 | M580 | 76.63916 | 84.76704 | 62.67677 | 64.97011 | 72.45706 | 62.17329 | 70.6139 | 169.5574
N102 | M1022 | 56.88255 | 93.85251 | 49.15134 | 91.16981 | 82.73499 | 49.01752 | 70.46812 | 49.19343
N102 | M276 | 45.39508 | 49.52274 | 80.3929 | 39.7603 | 73.90614 | 40.16648 | 54.85727 | -35.6627
N102 | M986 | 56.07566 | 47.0751 | 48.23597 | 55.15449 | 57.2735 | 56.68381 | 53.41642 | 16.88505
N102 | M556 | 21.67067 | 50.90629 | 21.7255 | 57.97105 | 29.45841 | 32.19147 | 35.6539 | -54.1526
N102 | M851 | 38.69177 | 22.67075 | 32.35278 | 36.74417 | 39.38383 | 34.78763 | 34.10515 | -116.723
N102 | M265 | 13.63181 | 75.12384 | 23.11756 | 60.24734 | 5.905576 | 24.16077 | 33.69782 | 25.51128
N102 | M897 | 28.86901 | 84.51688 | 16.08536 | 16.22755 | 17.17475 | 24.42942 | 31.21716 | -13.9763
More intuitively, as shown in fig. 4, a relatively gentle curve in fig. 4 represents the average AVG of the similarity values of the reference items, and a relatively steep curve represents the average POW of the difference values of the reference items, so that with the medicine information difference processing method in the present exemplary embodiment, the difference between the matching scores of the matching pairs can be significantly increased, and the accuracy of matching can be improved.
Further, in the present exemplary embodiment, the matching pairs may also be sorted according to the magnitude of the obtained composite matching score of the matching pairs, so that the standard drug data with the highest matching score to the non-standard drug record data may be quickly obtained.
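The sorting step can be illustrated with the AVG and POW columns of Table 4. The dictionary layout and the variable names are assumptions made for this sketch; only the numeric values come from the table.

```python
# (AVG, POW) per non-standard record, all matched against standard entry N102.
scores = {
    "M185": (73.52007, 151.7125),
    "M423": (72.64109, 179.0484),
    "M902": (71.64287, 172.1923),
    "M580": (70.6139, 169.5574),
    "M1022": (70.46812, 49.19343),
    "M276": (54.85727, -35.6627),
    "M986": (53.41642, 16.88505),
    "M556": (35.6539, -54.1526),
    "M851": (34.10515, -116.723),
    "M265": (33.69782, 25.51128),
    "M897": (31.21716, -13.9763),
}

# Sort matching pairs from highest to lowest composite score.
ranked_by_pow = sorted(scores, key=lambda mid: scores[mid][1], reverse=True)
ranked_by_avg = sorted(scores, key=lambda mid: scores[mid][0], reverse=True)

best_by_pow = ranked_by_pow[0]
best_by_avg = ranked_by_avg[0]
```

On this data the two rankings disagree at the top: the plain average favours M185, while the amplified score favours M423, whose six similarity values all lie above their thresholds.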
It should be noted that although the various steps of the methods of the present disclosure are depicted in the drawings in a particular order, this does not require or imply that these steps must be performed in this particular order, or that all of the depicted steps must be performed, to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step execution, and/or one step broken down into multiple step executions, etc.
Further, in the present exemplary embodiment, there is also provided a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the drug information difference processing method according to any one of the above-described embodiments.
Further, in the present exemplary embodiment, a medicine information difference processing system is also provided. Referring to fig. 5, the medicine information difference processing system 500 may include: a matching pair acquisition unit 510, a similarity value calculation unit 520, a difference value calculation unit 530, and a matching score calculation unit 540. Wherein:
the matching pair obtaining unit 510 is configured to obtain a matching pair composed of the drug record data and the standard drug data;
the similarity value calculation unit 520 is configured to calculate similarity values of the reference items of the drug record data and the standard drug data in the matching pair based on drug information codes;
the difference value calculating unit 530 is configured to calculate a difference value of each reference item based on the similarity value of each reference item and the judgment threshold corresponding to each reference item; and
the matching score calculating unit 540 is configured to perform weighted average on the difference values of the reference items to obtain a comprehensive matching score of the matching pair.
Further, in the present exemplary embodiment, calculating the difference value of each reference item based on the similarity value of each reference item and the judgment threshold corresponding to each reference item may include: comparing the similarity value of the reference item with the judgment threshold corresponding to the reference item; and obtaining the difference value of the reference item by exponentially amplifying the similarity value according to the comparison result, in combination with the corresponding judgment threshold, the corresponding compatibility probability, and the corresponding rejection probability.
Further, in the present exemplary embodiment, comparing the similarity value of the reference item with the judgment threshold corresponding to the reference item may include: comparing the similarity value of the reference item with the result obtained after the judgment threshold corresponding to the reference item is corrected by a correction factor.
Since each functional module of the medicine information difference processing system 500 of the exemplary embodiment of the present disclosure corresponds to the step of the exemplary embodiment of the medicine information difference processing method described above, it is not described herein again.
Further, in the present exemplary embodiment, there is also provided an electronic apparatus including a processor; and a memory on which is stored a computer program that, when executed by the processor, implements the drug information difference processing method of any one of the above embodiments.
It should be noted that although several modules or units of the drug information difference processing system are mentioned in the above detailed description, such division is not mandatory. Indeed, according to embodiments of the present disclosure, the features and functions of two or more modules or units described above may be embodied in one module or unit. Conversely, the features and functions of one module or unit described above may be further divided so as to be embodied by a plurality of modules or units.
Through the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein may be implemented by software, or by software in combination with necessary hardware. Therefore, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (which may be a CD-ROM, a USB flash drive, a removable hard disk, etc.) or on a network, and which includes several instructions to enable a computing device (which may be a personal computer, a server, a touch terminal, a network device, etc.) to execute the method according to the embodiments of the present disclosure.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.