CN110648734B - Method and device for identifying abnormal cases in medical treatment based on mean value - Google Patents

Method and device for identifying abnormal cases in medical treatment based on mean value

Publication number
CN110648734B
CN110648734B CN201810681273.5A CN201810681273A CN110648734B CN 110648734 B CN110648734 B CN 110648734B CN 201810681273 A CN201810681273 A CN 201810681273A CN 110648734 B CN110648734 B CN 110648734B
Authority
CN
China
Prior art keywords
vector
case
feature vector
target
value
Legal status
Active
Application number
CN201810681273.5A
Other languages
Chinese (zh)
Other versions
CN110648734A (en
Inventor
金涛
魏志杰
王建民
Current Assignee
Tsinghua University
Original Assignee
Tsinghua University

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H: HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H: HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H10/00: ICT specially adapted for the handling or processing of patient-related medical or healthcare data
    • G16H10/60: ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Public Health (AREA)
  • Data Mining & Analysis (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Biomedical Technology (AREA)
  • Databases & Information Systems (AREA)
  • Pathology (AREA)
  • Measuring And Recording Apparatus For Diagnosis (AREA)

Abstract

The embodiment of the invention provides a method and a device for identifying abnormal cases in medical treatment based on a mean value. The method comprises the following steps: acquiring a case of a specific disease, and representing the case by a first feature vector; acquiring a first feature value corresponding to each feature in the accumulated feature vector; determining a first target case quantity vector corresponding to a first target case according to the first feature value; calculating a first mean point vector according to the accumulated feature vector and the first target case quantity vector; acquiring a first feature vector to be identified of each case, and calculating a first distance value between the first feature vector to be identified and the first mean point vector; and taking a second target case whose first target distance value is larger than a first threshold value as an abnormal case. The device performs the above method. The method and the device provided by the embodiment of the invention can identify abnormal cases in medical treatment accurately, simply, conveniently and at low cost.

Description

Method and device for identifying abnormal cases in medical treatment based on mean value
Technical Field
The embodiment of the invention relates to the technical field of medical behavior identification, in particular to a method and a device for identifying abnormal cases in medical treatment based on a mean value.
Background
Staff in medical institutions, such as doctors or nurses, may induce patients to accept expensive treatments or treatment devices that are not relevant to their diseases, for example through drug abuse, excessive prescribing, or the addition of unnecessary examination items. This not only endangers the physical and mental health of patients but also wastes national medical resources. Such behaviors are defined as abnormal cases in medical treatment, so identifying them is very important.
With the arrival of the big data era, more and more medical data are recorded in medical information systems, and the prior art uses methods such as data mining to identify abnormal cases. The methods adopted have the following defects: 1) Some supervised or hybrid methods for identifying abnormal cases require labeled data, which is relatively costly to obtain in practice. 2) The calculation process is complex and time-consuming, and is not easily understood by non-computer professionals such as doctors. 3) Most methods derive their features by abstracting the original data and do not fully consider the influence of every item in the original data on the identification result, so the identification result is not accurate enough.
Therefore, how to avoid the above defects and identify abnormal cases in medical treatment accurately, simply and at low cost is a problem that urgently needs to be solved.
Disclosure of Invention
Aiming at the problems in the prior art, the embodiment of the invention provides a method and a device for identifying abnormal cases in medical treatment based on a mean value.
In a first aspect, an embodiment of the present invention provides a method for identifying an abnormal case in medical treatment based on a mean value, where the method includes:
acquiring a case of a specific disease, and representing the case according to a first feature vector; wherein the first feature vector is comprised of features associated with the particular disease;
accumulating the first feature vectors of all cases to obtain an accumulated feature vector;
acquiring a first feature value corresponding to each feature in the accumulated feature vector; determining a first target case quantity vector corresponding to a first target case according to the first feature value;
calculating a first mean value point vector according to the accumulated feature vector and the first target case quantity vector;
acquiring a first feature vector to be identified of each case, and calculating a first distance value between the first feature vector to be identified and the first mean point vector according to the first feature vector to be identified and the first mean point vector; and taking the second target case corresponding to the first target distance value larger than the first threshold value as an abnormal case.
In a second aspect, an embodiment of the present invention provides a mean-based apparatus for identifying an abnormal case in medical treatment, the apparatus including:
the acquiring unit is used for acquiring a case of a specific disease and expressing the case according to the first feature vector; wherein the first feature vector is comprised of features associated with the particular disease;
the accumulation unit is used for accumulating the first feature vectors of all cases to obtain an accumulated feature vector;
a determining unit, configured to obtain a first feature value corresponding to each feature in the accumulated feature vector, and to determine a first target case quantity vector corresponding to a first target case according to the first feature value;
the calculating unit is used for calculating a first mean value point vector according to the accumulated feature vector and the first target case quantity vector;
the identification unit is used for acquiring a first feature vector to be identified of each case and calculating a first distance value between the first feature vector to be identified and the first mean point vector according to the first feature vector to be identified and the first mean point vector; and taking the second target case corresponding to the first target distance value larger than the first threshold value as an abnormal case.
In a third aspect, an embodiment of the present invention provides an electronic device, including: a processor, a memory, and a bus, wherein,
the processor and the memory are communicated with each other through the bus;
the memory stores program instructions executable by the processor, the processor invoking the program instructions to perform a method comprising:
acquiring a case of a specific disease, and representing the case according to a first feature vector; wherein the first feature vector is comprised of features associated with the particular disease;
accumulating the first feature vectors of all cases to obtain an accumulated feature vector;
acquiring a first feature value corresponding to each feature in the accumulated feature vector; determining a first target case quantity vector corresponding to a first target case according to the first feature value;
calculating a first mean value point vector according to the accumulated feature vector and the first target case quantity vector;
acquiring a first feature vector to be identified of each case, and calculating a first distance value between the first feature vector to be identified and the first mean point vector according to the first feature vector to be identified and the first mean point vector; and taking the second target case corresponding to the first target distance value larger than the first threshold value as an abnormal case.
In a fourth aspect, an embodiment of the present invention provides a non-transitory computer-readable storage medium, including:
the non-transitory computer readable storage medium stores computer instructions that cause the computer to perform a method comprising:
acquiring a case of a specific disease, and representing the case according to a first feature vector; wherein the first feature vector is comprised of features associated with the particular disease;
accumulating the first feature vectors of all cases to obtain an accumulated feature vector;
acquiring a first feature value corresponding to each feature in the accumulated feature vector; determining a first target case quantity vector corresponding to a first target case according to the first feature value;
calculating a first mean value point vector according to the accumulated feature vector and the first target case quantity vector;
acquiring a first feature vector to be identified of each case, and calculating a first distance value between the first feature vector to be identified and the first mean point vector according to the first feature vector to be identified and the first mean point vector; and taking the second target case corresponding to the first target distance value larger than the first threshold value as an abnormal case.
According to the method and the device for identifying the abnormal cases in the medical treatment based on the mean value, provided by the embodiment of the invention, the first mean value point vector is calculated according to the accumulated feature vector and the first target case quantity vector, the first distance value between the first feature vector to be identified and the first mean value point vector is further calculated, the second target case corresponding to the first target distance value larger than the first threshold value is used as the abnormal case, and the abnormal cases in the medical treatment can be accurately, simply and conveniently identified at low cost.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
FIG. 1 is a flowchart illustrating a method for identifying abnormal cases in medical treatment based on mean values according to an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of an apparatus for identifying abnormal cases in medical treatment based on mean values according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Fig. 1 is a flowchart illustrating a method for identifying an abnormal case in medical treatment based on a mean value according to an embodiment of the present invention, and as shown in fig. 1, the method for identifying an abnormal case in medical treatment based on a mean value according to an embodiment of the present invention includes the following steps:
S101: acquiring a case of a specific disease, and representing the case according to a first feature vector; wherein the first feature vector is comprised of features associated with the particular disease.
Specifically, the device acquires a case of a specific disease and represents the case by a first feature vector, where the first feature vector is composed of features associated with the specific disease. Cases of a specific disease can be retrieved through the disease-related fields in the medical information system. For example, to obtain cases of the common cold, the device reads the cold-related disease query field entered by the user and queries the matching cases. Note that in the embodiment of the invention only a single disease is treated as a specific disease; cases with complications (such as a cold complicated by fever and pneumonia) are not single-disease cases and need to be excluded.
The features related to the specific disease may include the diagnosis and treatment items related to that disease. These comprise non-drug diagnosis and treatment items (such as pre-diagnosis tests) and drug diagnosis and treatment items (such as oral drugs and infusions). Correspondingly, the first feature value comprises a first non-drug feature value corresponding to a non-drug item and a first drug feature value corresponding to a drug item. For the first non-drug feature value, the number of items can be used directly. For the first drug feature value, because drug specifications differ, the quantities of drugs need to be converted to the same unit, and the converted quantity is used as the first drug feature value. The magnitude of each first feature value thus reflects the number of diagnosis and treatment items or the quantity of drugs used.
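The unit conversion for drug feature values can be illustrated with a minimal sketch. The text does not say which common unit is used, so milligrams of active ingredient, the function name `drug_feature_value` and the example strengths are all assumptions made for illustration.

```python
def drug_feature_value(quantity, strength_mg_per_unit):
    """Convert a dispensed quantity to a common unit (assumed here: milligrams)."""
    return quantity * strength_mg_per_unit


# Two hypothetical specifications of the same drug:
# two 250 mg tablets and one 500 mg tablet give the same feature value.
two_small_tablets = drug_feature_value(2, 250)
one_large_tablet = drug_feature_value(1, 500)
```

After conversion, quantities recorded under different specifications become directly comparable within one feature.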
The first feature values can be normalized, and the following steps are then executed using the normalized first feature values, so that the values of different features can be compared after normalization even if the features have different dimensions (units). The specific normalization method is a mature method in the field and is not described in detail.
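Since the normalization method is left unspecified, the following is only one possible sketch. It assumes scaling each feature by its maximum over all cases, chosen here because it keeps a zero value (item not used) at zero; the function name `scale_by_feature_max` is hypothetical.

```python
def scale_by_feature_max(cases):
    """Divide every feature by its largest value across all cases.

    Assumption: max scaling is used because it maps 0 (item not used) to 0.
    A feature that is never used stays 0 for every case.
    """
    m = len(cases[0])
    maxima = [max(case[i] for case in cases) for i in range(m)]
    return [
        [case[i] / maxima[i] if maxima[i] > 0 else 0.0 for i in range(m)]
        for case in cases
    ]


# Three illustrative cases with three features on very different scales.
cases = [[2, 0, 500], [4, 1, 0], [0, 3, 250]]
normalized = scale_by_feature_max(cases)
```

Other standard schemes could be substituted, provided that unused items still map to zero, because the later steps count non-zero entries.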
S102: the first feature vectors of all cases are accumulated to obtain an accumulated feature vector.
Specifically, the device accumulates the first feature vectors of all cases to obtain an accumulated feature vector. Each case is mapped to a point in N-dimensional space (N may represent the total number of cases). Then, the mean point of the points formed by all cases is calculated as a reflection of the average dosage level for all cases. The mean point can be represented by a mean point vector, and the calculation steps can be explained from two angles as follows:
from the perspective of the case: accumulating the first feature vectors of all cases to obtain an accumulated feature vector $v = (v_0, v_1, \ldots, v_{m-1})$, where $m$ represents the number of elements in the accumulated feature vector.
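The accumulation step amounts to an element-wise sum over all case vectors. A minimal sketch, with a hypothetical function name and made-up data:

```python
def accumulate(cases):
    """Element-wise sum of the feature vectors of all cases."""
    m = len(cases[0])
    return [sum(case[i] for case in cases) for i in range(m)]


# Three illustrative cases, each with m = 3 features.
cases = [[2, 0, 1], [4, 1, 0], [0, 3, 0]]
v = accumulate(cases)
```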
S103: acquiring a first feature value corresponding to each feature in the accumulated feature vector; and determining a first target case quantity vector corresponding to the first target case according to the first feature value.
Specifically, the device obtains a first feature value corresponding to each feature in the accumulated feature vector, and determines a first target case quantity vector corresponding to the first target case according to the first feature value. With reference to the above description: for the $i$th first feature value (i.e., the feature value from the perspective of the case), the cases whose $i$th first feature value is non-zero are taken as the 1-i target cases corresponding to the $i$th feature, and the number of 1-i target cases is taken as the $i$th element of the first target case quantity vector, thereby obtaining the first target case quantity vector. The first target case number in this vector is denoted $N_i$, i.e., the number of cases in which the diagnosis and treatment item represented by the feature is used. Note that the 1-i target cases are the cases whose $i$th feature value is not 0, that is, among all cases, those that use the diagnosis and treatment item corresponding to the $i$th feature.
When the first mean point vector is computed, each element of the accumulated feature vector has its own case count. Each feature is a diagnosis and treatment item (a drug or an examination): a first feature value of zero indicates that the case did not use the item, whereas a non-zero value indicates that it did. For example, if the $i$th feature is glucose and there are 100 patients (i.e., 100 cases corresponding to 100 first feature vectors), of whom 5 used glucose (in quantities 2, 4, 3, 2 and 4), then the mean of the feature is (95 × 0 + 2 + 4 + 3 + 2 + 4)/5 = 3 rather than (95 × 0 + 2 + 4 + 3 + 2 + 4)/100 = 0.15. That is, when the mean of the $i$th feature is computed, the target cases are the cases that use that feature, and the target cases differ from feature to feature.
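The glucose example can be checked with a short sketch that counts, per feature, the cases with a non-zero value. The function name `target_case_counts` and the single-feature layout are assumptions for illustration.

```python
def target_case_counts(cases):
    """For each feature, count the cases whose value for it is non-zero."""
    m = len(cases[0])
    return [sum(1 for case in cases if case[i] != 0) for i in range(m)]


# 100 cases with a single feature (glucose): 95 did not use it, 5 did.
glucose_cases = [[0]] * 95 + [[2], [4], [3], [2], [4]]

counts = target_case_counts(glucose_cases)        # one count per feature
total = sum(case[0] for case in glucose_cases)    # accumulated feature value
mean_over_users = total / counts[0]               # divides by the 5 users
mean_over_all = total / len(glucose_cases)        # divides by all 100 cases
```

Dividing by the number of users keeps the mean representative of typical usage among cases that actually received the item, instead of being diluted by the many cases that did not.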
S104: and calculating a first mean point vector according to the accumulated feature vector and the first target case quantity vector.
Specifically, the device calculates a first mean point vector according to the accumulated feature vector and the first target case quantity vector. Each element in the first mean point vector (i.e., the mean point vector from the perspective of the case) may be calculated according to the following formula:
$$\bar{v}_i = \frac{v_i}{N_i}$$
where $\bar{v}_i$ is an element of the first mean point vector $\bar{v} = (\bar{v}_0, \bar{v}_1, \ldots, \bar{v}_{m-1})$, $0 \le i < m$; $v_i$ is an element of the accumulated feature vector $v = (v_0, v_1, \ldots, v_{m-1})$; and $N_i$ is the first target case number in the first target case quantity vector.
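The element-wise division of the accumulated feature vector by the per-feature case counts can be sketched as follows. The function name is hypothetical, and returning 0 for a feature that no case uses is an assumption, since the text does not address that situation.

```python
def mean_point(cases):
    """Mean point vector: accumulated value of each feature divided by the
    number of cases that actually use that feature (non-zero value)."""
    m = len(cases[0])
    accumulated = [sum(case[i] for case in cases) for i in range(m)]
    counts = [sum(1 for case in cases if case[i] != 0) for i in range(m)]
    # Assumption: a feature that no case uses gets a mean of 0.0.
    return [accumulated[i] / counts[i] if counts[i] else 0.0 for i in range(m)]


# Four illustrative cases; the last one uses none of the items.
cases = [[2, 0, 1], [4, 1, 0], [0, 3, 0], [0, 0, 0]]
mean = mean_point(cases)
```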
S105: acquiring a first feature vector to be identified of each case, and calculating a first distance value between the first feature vector to be identified and the first mean point vector according to the first feature vector to be identified and the first mean point vector; and taking the second target case corresponding to the first target distance value larger than the first threshold value as an abnormal case.
Specifically, the device obtains a first feature vector to be identified of each case, and calculates a first distance value between the first feature vector to be identified and the first mean point vector; the second target case whose first target distance value is larger than the first threshold value is taken as an abnormal case. A first difference vector between the first feature vector to be identified (i.e., the feature vector to be identified from the perspective of a case, which may be represented as $x = (x_0, x_1, \ldots, x_{m-1})$) and the first mean point vector may be calculated according to the following formula:
$$xv_i = x_i - \bar{v}_i$$
where $xv_i$ is an element of the first difference vector $xv = (xv_0, xv_1, \ldots, xv_{m-1})$, $0 \le i < m$; $x_i$ is an element of the first feature vector to be identified $x = (x_0, x_1, \ldots, x_{m-1})$; and $\bar{v}_i$ is an element of the first mean point vector $\bar{v}$.
If an element $xv_i$ of $xv$ is less than zero, it is set to zero, i.e., only the features whose usage is above the average are considered. The result is expressed as follows:
$$xv' = (xv'_0, xv'_1, \ldots, xv'_{m-1})$$
The first distance value is calculated according to the following formula:
$$\mathrm{dist}_x = \sqrt{\sum_{i=0}^{m-1} \left(xv'_i\right)^2}$$
where $\mathrm{dist}_x$ is the first distance value and $xv'_i$ is an element of the zero-clipped first difference vector $xv' = (xv'_0, xv'_1, \ldots, xv'_{m-1})$.
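A minimal sketch of this distance, assuming the "modified Euclidean distance" means the Euclidean norm of the difference vector after its negative elements are set to zero. The function names and example numbers are hypothetical.

```python
import math


def clipped_difference(x, mean):
    """Difference to the mean point, with negative elements set to zero."""
    return [max(xi - mi, 0.0) for xi, mi in zip(x, mean)]


def distance(x, mean):
    """Euclidean norm of the zero-clipped difference vector."""
    return math.sqrt(sum(d * d for d in clipped_difference(x, mean)))


mean = [3.0, 2.0, 1.0]   # illustrative mean point vector
x = [6.0, 1.0, 5.0]      # illustrative case to be identified

diff = clipped_difference(x, mean)
dist = distance(x, mean)
```

In this example the second feature lies below the mean, so it contributes nothing to the distance; only above-average usage raises the score.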
That is, the distance between the point corresponding to each case and the mean point of the cases is calculated to reflect the overall degree of abnormality of the case. The first threshold value can be set independently according to the actual situation. Alternatively, the cases can be sorted by degree of abnormality from high to low (i.e., by first distance value from large to small), and the top-ranked cases are taken as abnormal cases, which then serve as a reference for judging medical fraud. The features corresponding to the larger elements of the zero-clipped first difference vector are taken as the reasons for the abnormality of an abnormal case (for example, the usage of a certain drug is far above the average level). The first distance is calculated based on a modified Euclidean distance, which is a mature technology in the field and is not described again.
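Thresholding, ranking and reporting the dominant feature can be combined in one short sketch. The function names, the threshold value and the data are illustrative assumptions, and reporting only the single largest element as the reason is a simplification.

```python
import math


def score(x, mean):
    """Return (distance, zero-clipped difference vector) for one case."""
    diff = [max(xi - mi, 0.0) for xi, mi in zip(x, mean)]
    return math.sqrt(sum(d * d for d in diff)), diff


def flag_cases(cases, mean, threshold):
    """Flag cases whose distance exceeds the threshold, most abnormal first.

    Each entry is (case index, distance, index of the feature with the
    largest above-average excess), the last item hinting at the reason.
    """
    flagged = []
    for idx, x in enumerate(cases):
        dist, diff = score(x, mean)
        if dist > threshold:
            top_feature = max(range(len(diff)), key=diff.__getitem__)
            flagged.append((idx, dist, top_feature))
    return sorted(flagged, key=lambda item: item[1], reverse=True)


mean = [3.0, 2.0, 1.0]
cases = [[2.0, 1.0, 1.0], [6.0, 1.0, 5.0], [3.0, 4.0, 1.0]]
flagged = flag_cases(cases, mean, threshold=1.5)
```

With these numbers the second case ranks first, driven by its third feature, and the first case is not flagged at all.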
The following description is made from the perspective of a case corresponding to each doctor:
Because a plurality of cases may belong to the same doctor, all cases are grouped by doctor and a feature vector (second feature vector) is obtained for each doctor: the feature vectors of all cases belonging to the doctor are added and averaged, and the resulting average vector is the feature vector of that doctor. Each doctor is then mapped to a point in M-dimensional space, and the mean point of the points formed by all doctors is found as a reflection of the average dosage level for all doctors.
Acquiring doctor identifiers corresponding to the cases, and acquiring a second feature vector of the case corresponding to each doctor according to the doctor identifiers; wherein the second feature vector is comprised of features associated with the particular disease. The doctor identification may include a doctor name, an ID, etc., without being particularly limited.
The second feature vectors of all cases corresponding to each doctor are added and averaged to obtain an average feature vector $d = (d_0, d_1, \ldots, d_{m-1})$ corresponding to each doctor;
The third target cases are the cases corresponding to all doctors, i.e., each doctor corresponds to one case. A second feature value corresponding to each feature in the average feature vector is acquired, and a third target case quantity vector is determined according to the second feature value. The third target case number in the third target case quantity vector is denoted $DN_i$, i.e., the number of doctors who use the diagnosis and treatment item represented by the feature.
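Grouping cases by doctor, averaging them, and counting the doctors who use each item can be sketched as follows. The record layout `(doctor_id, feature_vector)`, the identifiers and the function names are assumptions for illustration.

```python
from collections import defaultdict


def doctor_average_vectors(records):
    """Average the feature vectors of all cases belonging to each doctor.

    records: iterable of (doctor_id, feature_vector) pairs (assumed layout).
    """
    grouped = defaultdict(list)
    for doctor_id, vector in records:
        grouped[doctor_id].append(vector)
    averages = {}
    for doctor_id, vectors in grouped.items():
        m = len(vectors[0])
        averages[doctor_id] = [
            sum(vec[i] for vec in vectors) / len(vectors) for i in range(m)
        ]
    return averages


def doctor_counts(averages):
    """For each feature, count the doctors whose average value is non-zero."""
    vectors = list(averages.values())
    m = len(vectors[0])
    return [sum(1 for vec in vectors if vec[i] != 0) for i in range(m)]


records = [("dr_a", [2, 0]), ("dr_a", [4, 2]), ("dr_b", [0, 3])]
d = doctor_average_vectors(records)
dn = doctor_counts(d)
```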
Calculating a second mean value point vector according to the mean feature vector and the third target case quantity vector; each element in the second mean point vector may be calculated according to the following formula:
$$\bar{d}_i = \frac{d_i}{DN_i}$$
where $\bar{d}_i$ is an element of the second mean point vector $\bar{d} = (\bar{d}_0, \bar{d}_1, \ldots, \bar{d}_{m-1})$, $0 \le i < m$; $d_i$ is an element of the mean feature vector; and $DN_i$ is the third target case number in the third target case quantity vector.
Acquiring a second feature vector to be identified of a case corresponding to each doctor, and calculating a second distance value between the second feature vector to be identified and the second mean point vector according to the second feature vector to be identified and the second mean point vector; and taking the fourth target case corresponding to the second target distance value larger than the second threshold value as the abnormal case corresponding to each doctor. A second difference vector between the second feature vector to be recognized and the second mean point vector may be calculated according to the following formula:
$$yd_i = y_i - \bar{d}_i$$
where $yd_i$ is an element of the second difference vector $yd = (yd_0, yd_1, \ldots, yd_{m-1})$, $0 \le i < m$; $y_i$ is an element of the second feature vector to be identified $y = (y_0, y_1, \ldots, y_{m-1})$; and $\bar{d}_i$ is an element of the second mean point vector $\bar{d}$.
If an element $yd_i$ of $yd$ is less than zero, it is set to zero.
The second distance value is calculated according to the following formula:
$$\mathrm{dist}_y = \sqrt{\sum_{i=0}^{m-1} \left(yd'_i\right)^2}$$
where $\mathrm{dist}_y$ is the second distance value and $yd'_i$ is an element of the zero-clipped second difference vector $yd' = (yd'_0, yd'_1, \ldots, yd'_{m-1})$.
That is, the distance between the point corresponding to each doctor and the mean point of the doctors is calculated to reflect the overall degree of abnormality of the doctor. The second threshold value can be set independently according to the actual situation. Alternatively, the doctors can be sorted by degree of abnormality from high to low (i.e., by second distance value from large to small), and the cases of the top-ranked doctors are taken as the abnormal cases corresponding to those doctors, which then serve as a reference for judging medical fraud. The features corresponding to the larger elements of the zero-clipped second difference vector are taken as the reasons for the abnormality (for example, the usage of a certain drug is far above the average level). The second distance is calculated based on a modified Euclidean distance, which is a mature technology in the field and is not described again.
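The doctor-level procedure can be sketched end to end. This sketch assumes, by analogy with the case-level procedure, that the per-doctor average vectors are summed over all doctors before being divided by the number of doctors who use each item. The function name, identifiers and data are hypothetical.

```python
import math


def doctor_scores(doctor_vectors):
    """Mean point over doctors and zero-clipped Euclidean distance per doctor.

    Assumption: per-doctor average vectors are summed over all doctors and
    divided by the number of doctors with a non-zero value for the feature.
    """
    ids = list(doctor_vectors)
    m = len(doctor_vectors[ids[0]])
    totals = [sum(doctor_vectors[k][i] for k in ids) for i in range(m)]
    counts = [sum(1 for k in ids if doctor_vectors[k][i] != 0) for i in range(m)]
    mean = [totals[i] / counts[i] if counts[i] else 0.0 for i in range(m)]
    scores = {}
    for k in ids:
        diff = [max(y - mu, 0.0) for y, mu in zip(doctor_vectors[k], mean)]
        scores[k] = math.sqrt(sum(t * t for t in diff))
    return mean, scores


doctor_vectors = {"dr_a": [3.0, 1.0], "dr_b": [0.0, 3.0], "dr_c": [9.0, 0.0]}
mean, scores = doctor_scores(doctor_vectors)
ranking = sorted(scores, key=scores.get, reverse=True)
```

The ranking lists doctors from most to least abnormal, so either a threshold or a top-k cut can be applied to it.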
According to the method for identifying the abnormal cases in the medical treatment based on the mean value, provided by the embodiment of the invention, the first mean value point vector is calculated according to the accumulated feature vector and the first target case quantity vector, then, the first distance value between the first feature vector to be identified and the first mean value point vector is calculated, and the second target case corresponding to the first target distance value larger than the first threshold value is taken as the abnormal case, so that the abnormal cases in the medical treatment can be identified accurately, simply and conveniently at low cost.
On the basis of the foregoing embodiment, the determining a first target case quantity vector corresponding to a first target case according to the first feature value includes:
the first target cases are cases corresponding to all specific diseases; and taking the case corresponding to the non-zero ith first characteristic value as a 1-i target case corresponding to the ith characteristic, and taking the number of the 1-i target cases as the ith element of the first target case number vector, thereby obtaining the first target case number vector.
Specifically, the first target cases in the device are cases corresponding to all specific diseases; and taking the case corresponding to the non-zero ith first characteristic value as a 1-i target case corresponding to the ith characteristic, and taking the number of the 1-i target cases as the ith element of the first target case number vector, thereby obtaining the first target case number vector. Reference may be made to the above embodiments, which are not described in detail.
According to the method for identifying the abnormal cases in the medical treatment based on the mean value, provided by the embodiment of the invention, the calculated amount in the process of identifying the abnormal cases in the medical treatment can be further simplified by reasonably determining the number vector of the first target cases.
On the basis of the foregoing embodiment, the calculating a first mean point vector according to the accumulated feature vector and the first target number of cases includes:
calculating each element in the first mean point vector according to the following formula:
$$\bar{v}_i = \frac{v_i}{N_i}$$
where $\bar{v}_i$ is an element of the first mean point vector $\bar{v} = (\bar{v}_0, \bar{v}_1, \ldots, \bar{v}_{m-1})$, $0 \le i < m$; $v_i$ is an element of the accumulated feature vector $v = (v_0, v_1, \ldots, v_{m-1})$; and $N_i$ is the first target case number in the first target case quantity vector.
Specifically, the device calculates each element in the first mean point vector according to the same formula. Reference may be made to the above embodiments, which are not described in detail.
The method for identifying abnormal cases in medical treatment based on the mean value provided by the embodiment of the invention ensures that the mean-based method operates correctly by calculating the first mean point vector in this way.
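As a non-limiting illustration, the accumulation and mean point calculation can be sketched as follows, assuming that each element of the mean point vector is the accumulated value divided by the corresponding target case quantity. The function names are hypothetical, and returning 0.0 for a feature that no case uses is an assumption of this sketch, since the embodiment does not address that situation.

```python
from typing import List, Sequence


def accumulate(cases: Sequence[Sequence[float]]) -> List[float]:
    """Sum the feature vectors of all cases element by element."""
    return [sum(column) for column in zip(*cases)]


def mean_point(accumulated: Sequence[float], counts: Sequence[int]) -> List[float]:
    """Divide each accumulated value by the number of cases that used the feature."""
    # Assumption: a feature with a count of zero gets a mean of 0.0.
    return [v / n if n else 0.0 for v, n in zip(accumulated, counts)]


cases = [
    [2.0, 0.0, 1.0],
    [4.0, 3.0, 0.0],
    [0.0, 0.0, 5.0],
]
accumulated = accumulate(cases)
counts = [sum(1 for value in column if value != 0) for column in zip(*cases)]
first_mean_point = mean_point(accumulated, counts)
print(accumulated)       # [6.0, 3.0, 6.0]
print(counts)            # [2, 1, 2]
print(first_mean_point)  # [3.0, 3.0, 3.0]
```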
On the basis of the foregoing embodiment, the calculating a first distance value between the first to-be-identified feature vector and the first mean point vector according to the first to-be-identified feature vector and the first mean point vector includes:
calculating a first difference vector between the first to-be-identified feature vector and the first mean point vector according to the following formula:
$xv_i = x_i - \bar{v}_i$
wherein $xv_i$ is an element of the first difference vector $xv = (xv_0, xv_1, \ldots, xv_{m-1})$, $0 \le i < m$; $x_i$ is an element of the first to-be-identified feature vector $x = (x_0, x_1, \ldots, x_{m-1})$; $\bar{v}_i$ is an element of the first mean point vector $\bar{v} = (\bar{v}_0, \bar{v}_1, \ldots, \bar{v}_{m-1})$.
Specifically, the device calculates a first difference vector between the first to-be-identified feature vector and the first mean point vector according to the following formula:
$xv_i = x_i - \bar{v}_i$
wherein $xv_i$ is an element of the first difference vector $xv = (xv_0, xv_1, \ldots, xv_{m-1})$, $0 \le i < m$; $x_i$ is an element of the first to-be-identified feature vector $x = (x_0, x_1, \ldots, x_{m-1})$; $\bar{v}_i$ is an element of the first mean point vector $\bar{v} = (\bar{v}_0, \bar{v}_1, \ldots, \bar{v}_{m-1})$. Reference may be made to the above embodiments, which are not described in detail.
If an element $xv_i$ of $xv$ is less than zero, that element is set to zero; the element obtained after this processing is denoted $xv_i'$.
Specifically, if the device determines that an element $xv_i$ of $xv$ is less than zero, it sets that element to zero; the element obtained after this processing is denoted $xv_i'$. Reference may be made to the above embodiments, which are not described in detail.
Calculating the first distance value according to the following formula:
$d = \sqrt{\sum_{i=0}^{m-1} (xv_i')^2}$
wherein $d$ is the first distance value, and $xv_i'$ is an element of the first difference vector $xv' = (xv_0', xv_1', \ldots, xv_{m-1}')$ after the zero assignment.
Specifically, the device calculates the first distance value according to the following formula:
$d = \sqrt{\sum_{i=0}^{m-1} (xv_i')^2}$
wherein $d$ is the first distance value, and $xv_i'$ is an element of the first difference vector $xv' = (xv_0', xv_1', \ldots, xv_{m-1}')$ after the zero assignment. Reference may be made to the above embodiments, which are not described in detail.
The method for identifying abnormal cases in medical treatment based on the mean value provided by the embodiment of the invention ensures that the Euclidean-distance-based calculation operates correctly by calculating the first distance value in this way.
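As a non-limiting illustration, the difference, zero-assignment, and distance steps can be sketched together. The sketch assumes that the difference is taken as the case value minus the mean point value and that the distance is the Euclidean norm of the zero-assigned difference vector; the function name `one_sided_distance`, the threshold, and the sample values are hypothetical.

```python
import math
from typing import Sequence


def one_sided_distance(x: Sequence[float], mean: Sequence[float]) -> float:
    """Euclidean norm of (x - mean) after negative differences are set to zero."""
    differences = [xi - mi for xi, mi in zip(x, mean)]
    clipped = [d if d > 0 else 0.0 for d in differences]
    return math.sqrt(sum(d * d for d in clipped))


first_mean_point = [3.0, 3.0, 3.0]
first_threshold = 4.0  # hypothetical threshold value

case_above_mean = [6.0, 7.0, 1.0]  # differences: 3, 4, -2 -> 3, 4, 0
case_below_mean = [1.0, 1.0, 1.0]  # every difference is negative

distance_above = one_sided_distance(case_above_mean, first_mean_point)
distance_below = one_sided_distance(case_below_mean, first_mean_point)
print(distance_above)                    # 5.0
print(distance_below)                    # 0.0
print(distance_above > first_threshold)  # True
```

Because negative differences are set to zero, a case that uses less of every item than the mean point has a distance of zero, so only usage above the mean point contributes to a case being identified as abnormal.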
On the basis of the above embodiment, the feature related to the specific disease includes a diagnosis and treatment item related to the specific disease; the diagnosis and treatment items comprise non-drug diagnosis and treatment items and drug diagnosis and treatment items; correspondingly, the first characteristic value comprises a first non-drug characteristic value corresponding to the non-drug medical treatment item and a first drug characteristic value corresponding to the drug medical treatment item.
Specifically, the first characteristic value in the device includes a first non-drug characteristic value corresponding to the non-drug diagnosis and treatment item and a first drug characteristic value corresponding to the drug diagnosis and treatment item. Reference may be made to the above embodiments, which are not described in detail.
According to the method for identifying the abnormal cases in the medical treatment based on the mean value, provided by the embodiment of the invention, the abnormal cases in the medical treatment can be more comprehensively identified by taking the non-drug diagnosis and treatment items and the drug diagnosis and treatment items as the characteristic values.
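As a non-limiting illustration, a case can be turned into a feature vector that covers both non-drug and drug diagnosis and treatment items. The item names and the meaning of the values (for example, a count or a cost) are hypothetical, since the embodiment does not fix them.

```python
from typing import Dict, List

# Hypothetical feature vocabulary: non-drug items first, then drug items.
NON_DRUG_ITEMS = ["blood_test", "ct_scan"]
DRUG_ITEMS = ["amoxicillin", "ibuprofen"]
FEATURES = NON_DRUG_ITEMS + DRUG_ITEMS


def to_feature_vector(case_items: Dict[str, float]) -> List[float]:
    """Map the items recorded for a case onto a fixed-order feature vector.

    Items that do not occur in the case get the value 0.0.
    """
    return [float(case_items.get(name, 0.0)) for name in FEATURES]


case = {"blood_test": 2, "ibuprofen": 30.5}
vector = to_feature_vector(case)
print(vector)  # [2.0, 0.0, 0.0, 30.5]
```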
On the basis of the above embodiment, the method further includes performing normalization processing on the first feature value, and executing the above method by using the first feature value after the normalization processing.
Specifically, the device normalizes the first characteristic value, and executes the method by using the normalized first characteristic value. Reference may be made to the above embodiments, which are not described in detail.
According to the method for identifying the abnormal case in the medical treatment based on the mean value, provided by the embodiment of the invention, the normalization processing is carried out on the first characteristic value, so that the values of different characteristics after the normalization processing have comparability.
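As a non-limiting illustration, one possible normalization is to divide each feature value by the largest value of that feature over all cases. The embodiment does not specify a normalization scheme, so this choice, the function name `max_scale`, and the sample data are assumptions; non-negative feature values are also assumed.

```python
from typing import List, Sequence


def max_scale(cases: Sequence[Sequence[float]]) -> List[List[float]]:
    """Scale each feature to [0, 1] by dividing by its maximum over all cases."""
    maxima = [max(column) for column in zip(*cases)]
    return [
        # A feature whose maximum is zero is left at 0.0 to avoid dividing by zero.
        [value / top if top else 0.0 for value, top in zip(case, maxima)]
        for case in cases
    ]


cases = [
    [10.0, 200.0],
    [20.0, 400.0],
    [0.0, 300.0],
]
normalized = max_scale(cases)
print(normalized)  # [[0.5, 0.5], [1.0, 1.0], [0.0, 0.75]]
```

Dividing by the maximum keeps zero values at zero, so the count of cases with a non-zero value for each feature is unchanged by the normalization.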
On the basis of the above embodiment, the method further includes:
acquiring doctor identifiers corresponding to the cases, and acquiring a second feature vector of the case corresponding to each doctor according to the doctor identifiers; wherein the second feature vector is comprised of features associated with the particular disease.
Specifically, the device acquires doctor identifiers corresponding to the cases, and acquires a second feature vector of each case corresponding to each doctor according to the doctor identifiers; wherein the second feature vector is comprised of features associated with the particular disease. Reference may be made to the above embodiments, which are not described in detail.
And respectively adding the second feature vectors of all cases corresponding to each doctor and averaging to obtain an average feature vector corresponding to each doctor.
Specifically, the device adds the second feature vectors of all cases corresponding to each doctor and averages the second feature vectors to obtain an average feature vector corresponding to each doctor. Reference may be made to the above embodiments, which are not described in detail.
The third target cases are the cases corresponding to all doctors; a second feature value corresponding to each feature in the average feature vector is acquired; and a third target case quantity vector is determined according to the second feature value.
Specifically, in the device, the third target cases are the cases corresponding to all doctors; the device acquires a second feature value corresponding to each feature in the average feature vector, and determines a third target case quantity vector according to the second feature value. Reference may be made to the above embodiments, which are not described in detail.
And calculating a second mean point vector according to the mean feature vector and the third target case quantity vector.
Specifically, the device calculates a second mean point vector according to the average feature vector and the third target case quantity vector. Reference may be made to the above embodiments, which are not described in detail.
Acquiring a second feature vector to be identified of a case corresponding to each doctor, and calculating a second distance value between the second feature vector to be identified and the second mean point vector according to the second feature vector to be identified and the second mean point vector; and taking the fourth target case corresponding to the second target distance value larger than the second threshold value as the abnormal case corresponding to each doctor.
Specifically, the device acquires a second feature vector to be identified of a case corresponding to each doctor, and calculates a second distance value between the second feature vector to be identified and the second mean point vector according to the second feature vector to be identified and the second mean point vector; and taking the fourth target case corresponding to the second target distance value larger than the second threshold value as the abnormal case corresponding to each doctor. Reference may be made to the above embodiments, which are not described in detail.
According to the method for identifying the abnormal cases in the medical treatment based on the mean value, provided by the embodiment of the invention, the second mean value point vector is calculated according to the mean characteristic vector and the third target case quantity vector, then, the second distance value between the second characteristic vector to be identified and the second mean value point vector is calculated, the fourth target case corresponding to the second target distance value larger than the second threshold value is taken as the abnormal case, and the abnormal case corresponding to each doctor can be identified accurately, simply and conveniently at low cost.
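As a non-limiting illustration, the doctor-level steps can be sketched as follows. This is only one possible reading of the embodiment: the second mean point vector is built from the per-doctor average feature vectors, and each case of each doctor is then compared against it. All function names, doctor identifiers, the threshold, and the sample data are hypothetical.

```python
import math
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Record = Tuple[str, List[float]]  # (doctor identifier, second feature vector)


def doctor_average_vectors(records: Sequence[Record]) -> Dict[str, List[float]]:
    """Add up and average the feature vectors of all cases of each doctor."""
    grouped = defaultdict(list)
    for doctor_id, vector in records:
        grouped[doctor_id].append(vector)
    return {
        doctor_id: [sum(column) / len(vectors) for column in zip(*vectors)]
        for doctor_id, vectors in grouped.items()
    }


def mean_point(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Per feature: sum over vectors divided by the number of non-zero entries."""
    sums = [sum(column) for column in zip(*vectors)]
    counts = [sum(1 for v in column if v != 0) for column in zip(*vectors)]
    return [s / n if n else 0.0 for s, n in zip(sums, counts)]


def one_sided_distance(x: Sequence[float], mean: Sequence[float]) -> float:
    """Euclidean norm of (x - mean) with negative differences set to zero."""
    return math.sqrt(sum(max(xi - mi, 0.0) ** 2 for xi, mi in zip(x, mean)))


records = [
    ("dr_a", [1.0, 0.0]),
    ("dr_a", [3.0, 0.0]),
    ("dr_b", [2.0, 2.0]),
    ("dr_c", [10.0, 4.0]),
    ("dr_c", [6.0, 8.0]),
]
averages = doctor_average_vectors(records)
second_mean_point = mean_point(list(averages.values()))
second_threshold = 5.0  # hypothetical threshold value

abnormal_by_doctor: Dict[str, List[List[float]]] = {}
for doctor_id, vector in records:
    if one_sided_distance(vector, second_mean_point) > second_threshold:
        abnormal_by_doctor.setdefault(doctor_id, []).append(vector)

print(averages["dr_c"])    # [8.0, 6.0]
print(second_mean_point)   # [4.0, 4.0]
print(abnormal_by_doctor)  # {'dr_c': [[10.0, 4.0]]}
```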
Fig. 2 is a schematic structural diagram of an apparatus for identifying an abnormal case in medical treatment based on a mean value according to an embodiment of the present invention, and as shown in fig. 2, an apparatus for identifying an abnormal case in medical treatment based on a mean value according to an embodiment of the present invention includes an obtaining unit 201, an accumulating unit 202, a determining unit 203, a calculating unit 204, and an identifying unit 205, where:
the obtaining unit 201 is configured to obtain cases of a specific disease and represent each case according to a first feature vector; wherein the first feature vector is comprised of features associated with the specific disease;
the accumulating unit 202 is configured to accumulate the first feature vectors of all cases to obtain an accumulated feature vector;
the determining unit 203 is configured to obtain a first feature value corresponding to each feature in the accumulated feature vector, and to determine a first target case quantity vector corresponding to first target cases according to the first feature value;
the calculating unit 204 is configured to calculate a first mean point vector according to the accumulated feature vector and the first target case quantity vector;
the identifying unit 205 is configured to obtain a first to-be-identified feature vector of each case, calculate a first distance value between the first to-be-identified feature vector and the first mean point vector according to the first to-be-identified feature vector and the first mean point vector, and take the second target case corresponding to a first target distance value larger than the first threshold value as an abnormal case.
The device for identifying the abnormal cases in the medical treatment based on the mean value, provided by the embodiment of the invention, calculates the first mean value point vector according to the accumulated feature vector and the first target case quantity vector, further calculates the first distance value between the first feature vector to be identified and the first mean value point vector, and takes the second target case corresponding to the first target distance value larger than the first threshold value as the abnormal case, so that the abnormal cases in the medical treatment can be identified accurately, simply and conveniently at low cost.
The device for identifying an abnormal case in medical treatment based on a mean value provided by the embodiment of the present invention can be specifically used for executing the processing flow of each method embodiment, and the functions thereof are not described herein again, and reference can be made to the detailed description of the method embodiments.
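As a non-limiting illustration, the processing flow carried out by units 201 to 205 can be sketched as a single Python class. The class name `MeanBasedDetector`, its method names, the threshold, and the sample data are hypothetical; the sketch assumes a mean point equal to the accumulated value divided by the non-zero case count and a Euclidean distance over the zero-assigned differences.

```python
import math
from typing import List, Sequence


class MeanBasedDetector:
    """Sketch of the flow: accumulate, count, compute mean point, identify."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold
        self.mean_point: List[float] = []

    def fit(self, cases: Sequence[Sequence[float]]) -> "MeanBasedDetector":
        columns = list(zip(*cases))
        accumulated = [sum(column) for column in columns]  # accumulating unit 202
        counts = [sum(1 for v in column if v != 0) for column in columns]  # determining unit 203
        # Calculating unit 204; a zero count yields 0.0 (assumption of this sketch).
        self.mean_point = [v / n if n else 0.0 for v, n in zip(accumulated, counts)]
        return self

    def distance(self, x: Sequence[float]) -> float:
        return math.sqrt(sum(max(xi - mi, 0.0) ** 2 for xi, mi in zip(x, self.mean_point)))

    def identify(self, cases: Sequence[Sequence[float]]) -> List[int]:
        """Identifying unit 205: indices of cases whose distance exceeds the threshold."""
        return [i for i, x in enumerate(cases) if self.distance(x) > self.threshold]


# Obtaining unit 201 would supply these vectors; here they are hypothetical.
cases = [
    [2.0, 0.0, 1.0],
    [4.0, 3.0, 0.0],
    [0.0, 0.0, 5.0],
    [9.0, 3.0, 6.0],
]
detector = MeanBasedDetector(threshold=2.0).fit(cases)
abnormal_indices = detector.identify(cases)
print(detector.mean_point)  # [5.0, 3.0, 4.0]
print(abnormal_indices)     # [3]
```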
Fig. 3 is a schematic structural diagram of an electronic device according to an embodiment of the present invention. As shown in fig. 3, the electronic device includes: a processor 301, a memory 302, and a bus 303;
the processor 301 and the memory 302 communicate with each other through the bus 303;
the processor 301 is configured to call program instructions in the memory 302 to perform the methods provided by the above-mentioned method embodiments, including: acquiring a case of a specific disease, and representing the case according to a first feature vector; wherein the first feature vector is comprised of features associated with the particular disease; accumulating the first eigenvectors of all cases to obtain accumulated eigenvectors; acquiring a first characteristic value corresponding to each characteristic in the accumulated characteristic vector; determining a first target case quantity vector corresponding to a first target case according to the first characteristic value; calculating a first mean value point vector according to the accumulated feature vector and the first target case quantity vector; acquiring a first feature vector to be identified of each case, and calculating a first distance value between the first feature vector to be identified and the first mean point vector according to the first feature vector to be identified and the first mean point vector; and taking the second target case corresponding to the first target distance value larger than the first threshold value as an abnormal case.
The present embodiment discloses a computer program product comprising a computer program stored on a non-transitory computer readable storage medium, the computer program comprising program instructions which, when executed by a computer, enable the computer to perform the method provided by the above-mentioned method embodiments, for example, comprising: acquiring a case of a specific disease, and representing the case according to a first feature vector; wherein the first feature vector is comprised of features associated with the particular disease; accumulating the first eigenvectors of all cases to obtain accumulated eigenvectors; acquiring a first characteristic value corresponding to each characteristic in the accumulated characteristic vector; determining a first target case quantity vector corresponding to a first target case according to the first characteristic value; calculating a first mean value point vector according to the accumulated feature vector and the first target case quantity vector; acquiring a first feature vector to be identified of each case, and calculating a first distance value between the first feature vector to be identified and the first mean point vector according to the first feature vector to be identified and the first mean point vector; and taking the second target case corresponding to the first target distance value larger than the first threshold value as an abnormal case.
The present embodiment provides a non-transitory computer-readable storage medium storing computer instructions that cause a computer to perform the methods provided by the above method embodiments, for example, including: acquiring a case of a specific disease, and representing the case according to a first feature vector; wherein the first feature vector is comprised of features associated with the specific disease; accumulating the first feature vectors of all cases to obtain an accumulated feature vector; acquiring a first feature value corresponding to each feature in the accumulated feature vector; determining a first target case quantity vector corresponding to first target cases according to the first feature value; calculating a first mean point vector according to the accumulated feature vector and the first target case quantity vector; acquiring a first to-be-identified feature vector of each case, and calculating a first distance value between the first to-be-identified feature vector and the first mean point vector according to the first to-be-identified feature vector and the first mean point vector; and taking the second target case corresponding to a first target distance value larger than the first threshold value as an abnormal case.
Those of ordinary skill in the art will understand that all or part of the steps of the method embodiments may be implemented by hardware controlled by program instructions; the program may be stored in a computer-readable storage medium and, when executed, performs the steps of the method embodiments; and the aforementioned storage medium includes various media that can store program code, such as ROM, RAM, magnetic disks, or optical disks.
The above-described embodiments of the electronic device and the like are merely illustrative, where the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units, may be located in one place, or may also be distributed on multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only used for illustrating the technical solutions of the embodiments of the present invention, and are not limited thereto; although embodiments of the present invention have been described in detail with reference to the foregoing embodiments, those skilled in the art will understand that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A mean-based method for identifying abnormal cases in medical treatment, comprising:
acquiring a case of a specific disease, and representing the case according to a first feature vector; wherein the first feature vector is comprised of features associated with the particular disease;
accumulating the first eigenvectors of all cases to obtain accumulated eigenvectors;
acquiring a first characteristic value corresponding to each characteristic in the accumulated characteristic vector; determining a first target case quantity vector corresponding to a first target case according to the first characteristic value;
calculating a first mean value point vector according to the accumulated feature vector and the first target case quantity vector;
acquiring a first feature vector to be identified of each case, and calculating a first distance value between the first feature vector to be identified and the first mean point vector according to the first feature vector to be identified and the first mean point vector; and taking the second target case corresponding to the first target distance value larger than the first threshold value as an abnormal case.
2. The method of claim 1, wherein determining a first target case quantity vector corresponding to a first target case according to the first feature value comprises:
the first target cases are all cases of the specific disease; for each feature i, the cases whose ith first feature value is non-zero are taken as the 1-i target cases corresponding to the ith feature, and the number of the 1-i target cases is taken as the ith element of the first target case quantity vector, thereby obtaining the first target case quantity vector.
3. The method of claim 1, wherein calculating a first mean point vector based on the accumulated feature vector and the first target case quantity vector comprises:
calculating each element in the first mean point vector according to the following formula:
$\bar{v}_i = v_i / N_i$
wherein $\bar{v}_i$ is an element of the first mean point vector $\bar{v} = (\bar{v}_0, \bar{v}_1, \ldots, \bar{v}_{m-1})$, $0 \le i < m$, and $m$ represents the number of elements in the accumulated feature vector; $v_i$ is an element of the accumulated feature vector $v = (v_0, v_1, \ldots, v_{m-1})$; $N_i$ is the corresponding first target case quantity in the first target case quantity vector.
4. The method of claim 1, wherein the calculating a first distance value between the first to-be-identified feature vector and the first mean point vector according to the first to-be-identified feature vector and the first mean point vector comprises:
calculating a first difference vector between the first to-be-identified feature vector and the first mean point vector according to the following formula:
$xv_i = x_i - \bar{v}_i$
wherein $xv_i$ is an element of the first difference vector $xv = (xv_0, xv_1, \ldots, xv_{m-1})$, $0 \le i < m$, and $m$ represents the number of elements in the accumulated feature vector; $x_i$ is an element of the first to-be-identified feature vector $x = (x_0, x_1, \ldots, x_{m-1})$; $\bar{v}_i$ is an element of the first mean point vector $\bar{v} = (\bar{v}_0, \bar{v}_1, \ldots, \bar{v}_{m-1})$;
if an element $xv_i$ of $xv$ is less than zero, setting that element to zero, the element obtained after this processing being denoted $xv_i'$;
calculating the first distance value according to the following formula:
$d = \sqrt{\sum_{i=0}^{m-1} (xv_i')^2}$
wherein $d$ is the first distance value, and $xv_i'$ is an element of the first difference vector $xv' = (xv_0', xv_1', \ldots, xv_{m-1}')$ after the zero assignment.
5. The method of claim 1, wherein the characteristics related to the specific disease include medical items related to the specific disease; the diagnosis and treatment items comprise non-drug diagnosis and treatment items and drug diagnosis and treatment items; correspondingly, the first characteristic value comprises a first non-drug characteristic value corresponding to the non-drug medical treatment item and a first drug characteristic value corresponding to the drug medical treatment item.
6. The method of claim 1, further comprising normalizing the first feature value and performing the method of claim 1 using the normalized first feature value.
7. The method of any of claims 1 to 6, further comprising:
acquiring doctor identifiers corresponding to the cases, and acquiring a second feature vector of the case corresponding to each doctor according to the doctor identifiers; wherein the second feature vector is comprised of features associated with the particular disease;
adding the second characteristic vectors of all cases corresponding to each doctor respectively, and averaging to obtain an average characteristic vector corresponding to each doctor;
the third target case is the case corresponding to all doctors; acquiring a second characteristic value corresponding to each characteristic in the average characteristic vector; determining a third target case quantity vector according to the second characteristic value;
calculating a second mean value point vector according to the mean feature vector and the third target case quantity vector;
acquiring a second feature vector to be identified of a case corresponding to each doctor, and calculating a second distance value between the second feature vector to be identified and the second mean point vector according to the second feature vector to be identified and the second mean point vector; and taking the fourth target case corresponding to the second target distance value larger than the second threshold value as the abnormal case corresponding to each doctor.
8. An apparatus for mean-based identification of abnormal cases in medical treatment, comprising:
the acquiring unit is used for acquiring a case of a specific disease and expressing the case according to the first feature vector; wherein the first feature vector is comprised of features associated with the particular disease;
the accumulation unit is used for accumulating the first eigenvectors of all cases to obtain accumulated eigenvectors;
a determining unit, configured to obtain a first feature value corresponding to each feature in the accumulated feature vector; determining a first target case quantity vector corresponding to a first target case according to the first characteristic value;
the calculating unit is used for calculating a first mean value point vector according to the accumulated feature vector and the first target case quantity vector;
the identification unit is used for acquiring a first feature vector to be identified of each case and calculating a first distance value between the first feature vector to be identified and the first mean point vector according to the first feature vector to be identified and the first mean point vector; and taking the second target case corresponding to the first target distance value larger than the first threshold value as an abnormal case.
9. An electronic device, comprising: a processor, a memory, and a bus, wherein,
the processor and the memory are communicated with each other through the bus;
the memory stores program instructions executable by the processor, the processor invoking the program instructions to perform the method of any of claims 1 to 7.
10. A non-transitory computer-readable storage medium storing computer instructions that cause a computer to perform the method of any one of claims 1 to 7.
CN201810681273.5A 2018-06-27 2018-06-27 Method and device for identifying abnormal cases in medical treatment based on mean value Active CN110648734B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810681273.5A CN110648734B (en) 2018-06-27 2018-06-27 Method and device for identifying abnormal cases in medical treatment based on mean value

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810681273.5A CN110648734B (en) 2018-06-27 2018-06-27 Method and device for identifying abnormal cases in medical treatment based on mean value

Publications (2)

Publication Number Publication Date
CN110648734A CN110648734A (en) 2020-01-03
CN110648734B true CN110648734B (en) 2022-04-22

Family

ID=68989109

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810681273.5A Active CN110648734B (en) 2018-06-27 2018-06-27 Method and device for identifying abnormal cases in medical treatment based on mean value

Country Status (1)

Country Link
CN (1) CN110648734B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113822365B (en) * 2021-09-28 2023-09-05 北京恒生芸泰网络科技有限公司 Medical data storage and big data mining method and system based on block chain technology

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105159948A (en) * 2015-08-12 2015-12-16 成都数联易康科技有限公司 Medical insurance fraud detection method based on multiple features
CN106874658A (en) * 2017-01-18 2017-06-20 天津艾登科技有限公司 A kind of medical insurance fraud recognition methods based on Principal Component Analysis Algorithm

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB201012519D0 (en) * 2010-07-26 2010-09-08 Ucl Business Plc Method and system for anomaly detection in data sets

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
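Sequence clustering algorithm based on edit distance and its application in clinical anomaly detection; Sun Qihang; China Excellent Doctoral and Master's Theses Full-text Database (Master's), Medicine and Health Sciences; 20180115 (No. 01); E053-159 *
Research on the identification of abnormal medical behavior; Li Xuecang et al.; Smart Healthcare; 20151120 (No. 02); 46-52 *
Research on high-dimensional data mining techniques for clinical anomalies; Yang Hebiao et al.; Computer Engineering and Design; 20131116; Vol. 34 (No. 11); 345-349 *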

Also Published As

Publication number Publication date
CN110648734A (en) 2020-01-03

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant