CN111986034B - Medical insurance group fraud monitoring method, system and storage medium - Google Patents

Publication number
CN111986034B
Authority
CN
China
Prior art keywords
patients
similarity
node
nodes
patient
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010818035.1A
Other languages
Chinese (zh)
Other versions
CN111986034A (en)
Inventor
王琼
邬正国
李志峰
谢提提
胡磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu Yunnao Data Technology Co ltd
Original Assignee
Jiangsu Yunnao Data Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu Yunnao Data Technology Co ltd
Priority to CN202010818035.1A
Publication of CN111986034A
Application granted
Publication of CN111986034B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/08 Insurance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H10/00 ICT specially adapted for the handling or processing of patient-related medical or healthcare data
    • G16H10/60 ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients

Abstract

The invention provides a medical insurance group fraud monitoring method comprising the following steps: step S1, generating an analysis dataset of the patients; step S2, calculating the similarity between patients; step S3, mining maximal groups whose members are highly similar to one another; and step S4, manually reviewing and judging the suspicious groups according to the visit details of the group members. The invention also provides a medical insurance group fraud monitoring system comprising: a memory storing a computer program; and a processor for executing the computer program, which when executed performs the steps of the method described above except for step S4. The method makes it possible to identify, accurately and efficiently, abnormal groups that commit fraud against the medical insurance fund.

Description

Medical insurance group fraud monitoring method, system and storage medium
Technical Field
The invention relates to the field of medical insurance fund anti-fraud, in particular to a medical insurance group fraud monitoring method and system.
Background
At present, application systems for medical insurance anti-fraud in China mainly build a rule base by summarizing fraud cases that have occurred in actual business. As time goes on, the patterns of fraud become more complex and varied, and a fixed rule base has difficulty identifying new fraudulent behaviors. Moreover, suspected fraud is determined by detection rules defined by experts, and selecting the thresholds and weights in those rules is very difficult. Because diagnosis and treatment are highly specialized and fraud is well concealed within treatment, detection that relies on rules alone is to some extent unreasonable, and its accuracy is extremely low.
In reality, owing to the concealment of fraudulent conduct, the complexity of the parties involved, the high incidence and diversity of fraud cases, and the limited anti-fraud capability of the medical insurance department, it is very difficult to judge fraud by inspection, and fraud cases are hard to screen directly. However, from a big-data perspective, the fraudulent conduct of any party is necessarily recorded in the medical insurance data, and the data of each medical institution is recorded in the data management system of the medical insurance field. Professional data analysis techniques can therefore uncover latent patterns of medical insurance fraud in visit behavior and turn them into a model for screening medical service behavior in advance, so that fraud is detected and losses to the medical insurance fund are avoided.
In general, monitoring medical insurance fraud is of great importance. By using big-data mining algorithms to uncover the patterns hidden in the data and by constructing an intelligent monitoring model for medical fraud, groups that commit fraud against the medical insurance fund can be accurately identified, so as to:
(1) detect improper use of the medical insurance fund and reduce its needless waste;
(2) narrow the scope of suspicious fraudulent behavior and improve working efficiency;
(3) find potential covert fraud that falls outside the business rules.
Driven by profit, fraud cases occur with high frequency, and what used to be violations by individual insured persons have evolved into organized group fraud. In current medical insurance fraud, the sums involved in group fraud are huge. For example, illegal organizations buy up the medical insurance cards of many insured persons and send people to hospitals with those cards to obtain, again and again, medicines covered by the medical insurance pool.
Disclosure of Invention
In view of this, the invention aims to provide a medical insurance group fraud monitoring method and system that move medical insurance fund supervision from manual, sampling-based bill auditing to intelligent big-data monitoring covering the whole process, making it possible to identify, accurately and efficiently, abnormal groups that commit fraud against the medical insurance fund.
In a first aspect, an embodiment of the present invention provides a medical insurance group fraud monitoring method, including the following steps:
step S1, generating an analysis dataset of the patient;
step S2, calculating the similarity between patients;
step S3, mining maximal groups whose members are highly similar to each other.
Further, in the method,
let $P = \{p_1, p_2, \ldots, p_m\}$ denote the set of patients who sought treatment, and let $G = \{g_1, g_2, \ldots, g_n\}$ denote a group with similar visit behavior;
$G \subseteq P$,
and the visit behaviors of any two patients $g_i$, $g_j$ in $G$ are highly similar;
the visit behavior refers to the activity of a patient in one visit; the behavior $b$ of patient $p$ seeking treatment at a time $t$ and a place $s$ is recorded as $b = (p, t, s)$; the place $s$ is a doctor, a department or a hospital;
similar behavior means that different patients p have made the same type of visit within a certain period of time; $SB(p_i, p_j)$ denotes the set of similar behaviors of any two patients;
step S1 specifically includes:
the following fields are extracted from the patients' visit data imported from the hospital:
1) the date of the visit;
2) hospital ID and/or department ID and/or doctor ID;
3) a patient ID;
step S2 specifically includes:
first, the similarity of similar behaviors is calculated; it measures how alike two similar behaviors are; if $b_i = (p_i, t_i, s_i)$ and $b_j = (p_j, t_j, s_j)$ are similar behaviors, then $s_i = s_j$ and $|t_i - t_j| \le T$, where $T$ is a time interval; the similarity of similar behaviors is calculated by formula (1):
[Formula (1): equation image GDA0003555397020000022 is not reproduced in this text]
then, the similarity between patients is calculated by formula (2):
[Formula (2): equation image GDA0003555397020000023 is not reproduced in this text]
where $N(p_i)$ denotes the number of visits of patient $p_i$ within a certain period of time, and $N(p_j)$ denotes the number of visits of patient $p_j$ within the same period;
step S3 specifically includes:
first, the similarity Sim between each patient and every other patient is calculated according to formula (2); patient pairs whose Sim exceeds the inter-patient similarity threshold are then retained, and a sparse matrix of the highly similar patients is output;
the inter-patient association network graph is then output according to the sparse matrix; in the association network graph, $N$ denotes the set of nodes, $E$ denotes the set of edges connecting the nodes, and $W$ denotes the degree of similarity between nodes, so that $W_{ij} = Sim(p_i, p_j)$, $p_i, p_j \in N$;
Once the inter-patient association network graph is available, the maximal groups whose members are highly similar to each other are mined from it.
Further, in the method,
in step S3, mining the maximal groups whose members are highly similar to each other in the association network graph specifically includes:
a subset is a fully connected subgraph of the association network graph (a clique, in graph-theory terms), that is, any two nodes in the subset are connected by an edge; a subset represents a group, that is, any two patients in the subset are similar;
a subset is called a maximal subset if it cannot be expanded into a larger subset by adding any one or more nodes; a group is represented by a maximal subset;
by the definition of the maximal subset, groups can be located in the inter-patient association network graph; mining all maximal subsets of the association network graph then finds all groups;
when a group is required to contain at least h members, each member has at least h-1 edges; the set of nodes satisfying this condition is the h-node set;
with $H$ denoting the h-node set, $H = \{n : n \in N,\ d(n) \ge h-1\}$, where $d(n)$ is the degree of node $n$, i.e. the number of edges of node $n$; that is, $H$ is the set of nodes having at least h-1 edges; the MH graph denotes the subgraph of the inter-patient association network graph formed by the nodes in $H$;
the h-node set $H$ matching the required group size h is found in the inter-patient association network graph, the MH graph is derived from it, and all groups are then mined by exhaustively enumerating the maximal subsets on the MH graph.
Preferably, in the method, after the MH graph is derived in step S3, the top X% of nodes ranked by node similarity are selected as seed nodes, and partition-based maximal subset enumeration is performed on the MH graph using the seed nodes, so as to obtain all groups;
the node similarity $C_n$ of node $n$ is calculated as:
$$C_n = \frac{1}{d(n)} \sum_{m \in nei(n)} W_{nm}$$
where
(1) $d(n)$ denotes the degree of node $n$, i.e. the number of edges of node $n$;
(2) $nei(n)$ denotes the set of neighbor nodes of node $n$;
(3) $W_{nm}$ denotes the similarity between node $n$ and its neighbor node $m$.
Further, the inter-patient similarity threshold is set to 0.8.
Further, h is set to any number of 3 to 6.
Further, X% is set to 30%.
Further, after step S3, the method further includes:
step S4, manually reviewing and judging the suspicious groups according to the visit details of the group members.
In a second aspect, an embodiment of the present invention provides a medical insurance group fraud monitoring system, including:
a memory storing a computer program;
a processor for executing the computer program, the computer program when executed performing the steps of the method as hereinbefore described except for step S4.
In a third aspect, an embodiment of the present invention further provides a storage medium, in which a computer program is stored, the computer program being configured to, when executed, perform the steps of the method as above except for step S4.
The invention has the advantages that:
1) Manual review costs are reduced and manual review efficiency is improved;
In practice, fraudulent patients make up only a small part of the whole patient population, so only a tiny fraction of a hospital's massive visit detail data consists of fraudulent records. Whether sampling is random or rule-based, a sampled patient is very likely to be a normally behaving patient. The method provided by the invention automatically separates groups from the massive data through the model and outputs the visit behavior indicators of each group, which both narrows the range of suspected patients and improves the efficiency of manual review.
2) Manual review accuracy is improved and losses to the medical insurance fund are reduced;
At present, in medical insurance anti-fraud, experts define a rule base of suspected fraudulent behaviors from past experience in order to single out suspected patients. Over time, however, the behavior of fraud groups becomes ever more concealed and varied, and the rule base loses effectiveness. The method provided by the invention models real-time visit behavior data, learns the patterns in the data and identifies suspect groups accurately, which raises the accuracy of manual review and reduces losses to the medical insurance fund.
Drawings
FIG. 1 is a flow chart of a method in an embodiment of the invention.
FIG. 2 is an exemplary diagram of a sparse matrix in an embodiment of the present invention.
FIG. 3 is an exemplary diagram of an inter-patient association network graph in an embodiment of the invention.
FIG. 4 is an exemplary diagram of a subset in an embodiment of the invention.
FIG. 5 is an exemplary diagram of a maximal subset in an embodiment of the invention.
FIG. 6 is an exemplary diagram of an MH graph in an embodiment of the invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The embodiment of the invention provides a medical insurance group fraud monitoring method. The following definitions are used:
Definition 1, group:
within the patient population, a group is a set of people whose visit behaviors are highly similar;
let $P = \{p_1, p_2, \ldots, p_m\}$ denote the set of patients who sought treatment, and let $G = \{g_1, g_2, \ldots, g_n\}$ denote a group with similar visit behavior;
$G \subseteq P$,
and the visit behaviors of any two patients $g_i$, $g_j$ in $G$ are highly similar;
there may be several groups with similar behavior within $P$.
Definition 2, visit behavior:
the visit behavior refers to the activity of a patient in one visit;
the behavior $b$ of patient $p$ seeking treatment at a time $t$ and a place $s$ (a doctor, a department or a hospital) may be recorded as $b = (p, t, s)$; for example, $b_1$ may be ("p = ID01", "t = 2020/7/15", "s = Doctor/Department/Hospital");
in this embodiment, s defaults to a doctor and may be switched to a department or a hospital. The reason is that, in a real visit, a doctor prescribes appropriate medicine only after the patient has been diagnosed with a disease. If, in special circumstances, a fraudster who is not ill can get a doctor to write prescriptions at will, the fraudster can exploit this "convenience" to maximize the use of the medical insurance cards in hand, that is, visit that doctor frequently to obtain prescriptions. In practice, a fraudster may obtain such "convenience" from a few doctors, but can hardly obtain it across a whole department or hospital;
Definition 3, similar behavior:
similar behavior means that different patients p have made the same type of visit within a certain period of time;
if different patients p visit the same doctor, the same department or the same hospital within the time interval T, they are regarded as having made the same type of visit; the time interval T defaults to 3 days and may be set freely for a specific scenario;
if $b_1$ = ("p_1 = ID01", "t = 2020/7/15", "Doctor = ID123") and $b_2$ = ("p_2 = ID02", "t = 2020/7/16", "Doctor = ID123"), then $b_1$ and $b_2$ are similar behaviors;
$SB(p_i, p_j)$ denotes the set of similar behaviors of any two patients, that is, all similar behaviors of patients $p_i$ and $p_j$ within a certain period of time;
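Definitions 2 and 3 can be illustrated with a minimal Python sketch. The `Visit` record, the function names and the sample IDs are hypothetical and chosen only for illustration; the sketch assumes dates are handled at day granularity and that "within the time interval T" means a gap of at most T days.

```python
from datetime import date
from typing import NamedTuple


class Visit(NamedTuple):
    """One visit behavior b = (p, t, s)."""
    patient: str  # p: patient ID
    day: date     # t: visit date, in days
    place: str    # s: doctor, department or hospital ID


def is_similar(a: Visit, b: Visit, T: int = 3) -> bool:
    """True when two different patients visited the same place within T days."""
    return (
        a.patient != b.patient
        and a.place == b.place
        and abs((a.day - b.day).days) <= T
    )


def similar_behaviors(visits_i, visits_j, T: int = 3):
    """SB(p_i, p_j): every pair of similar behaviors between two patients."""
    return [(a, b) for a in visits_i for b in visits_j if is_similar(a, b, T)]


b1 = Visit("ID01", date(2020, 7, 15), "ID123")
b2 = Visit("ID02", date(2020, 7, 16), "ID123")
b3 = Visit("ID02", date(2020, 7, 30), "ID123")  # same doctor, but 15 days later
sb = similar_behaviors([b1], [b2, b3])
```

Here `b1` and `b2` form a similar behavior pair, while `b3` falls outside the default 3-day interval.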
the method comprises the following steps:
step S1, generating an analysis dataset of the patient;
the following fields are extracted from the patients' visit data imported from the hospital:
1) the visit date, in days;
2) the hospital ID and/or department ID and/or doctor ID, used as the classification field;
3) the patient ID; if a patient has several visit records for the same classification field on the same day, only one record is kept, so that each patient ID appears at most once per day and classification field;
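The extraction and de-duplication in step S1 might be sketched as follows. The input layout (tuples of patient ID, timestamp and classification-field ID) and the function name `build_analysis_dataset` are assumptions made for the example; the source does not specify a data format.

```python
from datetime import datetime


def build_analysis_dataset(raw_records):
    """Keep one record per (patient ID, day, classification field).

    raw_records: iterable of (patient_id, timestamp, place_id) tuples,
    where place_id is the hospital, department or doctor ID in use.
    """
    # Truncating the timestamp to a date and collecting into a set
    # removes repeated visits to the same place on the same day.
    unique = {(pid, ts.date(), place) for pid, ts, place in raw_records}
    return sorted(unique)


raw = [
    ("ID01", datetime(2020, 7, 15, 9, 0), "ID123"),
    ("ID01", datetime(2020, 7, 15, 14, 30), "ID123"),  # same day, same doctor
    ("ID01", datetime(2020, 7, 16, 10, 0), "ID123"),
    ("ID02", datetime(2020, 7, 15, 11, 0), "ID123"),
]
dataset = build_analysis_dataset(raw)
```

The two records of `ID01` on 2020/7/15 collapse into one, leaving three rows.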
step S2, calculating the similarity between patients;
because the visit behaviors of group members are highly similar, similar visit behaviors are found first, and the similarity of those similar behaviors is then calculated; the higher the value, the more alike the visit behaviors; finally, the similarity between patients is calculated on the basis of the similarity of similar behaviors; patients whose similarity exceeds the inter-patient similarity threshold are regarded as highly similar patients and are sorted in descending order of inter-patient similarity;
Definition 4, similarity of similar behaviors:
the similarity of similar behaviors measures how alike two similar behaviors are; if $b_i = (p_i, t_i, s_i)$ and $b_j = (p_j, t_j, s_j)$ are similar behaviors, then $s_i = s_j$ and $|t_i - t_j| \le T$; the similarity of similar behaviors therefore depends only on the time interval, and the shorter the interval between two similar behaviors, the greater the similarity between the visit behaviors; the similarity of similar behaviors is calculated by formula (1):
[Formula (1): equation image GDA0003555397020000051 is not reproduced in this text]
Definition 5, similarity between patients:
the inter-patient similarity is the similarity of the visit behaviors of two patients within a certain period of time, that is, it relates the sum of the similarities of all similar behaviors of the two patients within the time interval T to their visit behaviors; the similarity between patients is calculated by formula (2):
[Formula (2): equation image GDA0003555397020000052 is not reproduced in this text]
where $N(p_i)$ denotes the number of visits of patient $p_i$ within a certain period of time, and $N(p_j)$ denotes the number of visits of patient $p_j$ within the same period; clearly, the larger $Sim(p_i, p_j)$ is, the greater the similarity between patients $p_i$ and $p_j$;
the inter-patient similarity threshold is set to 0.8 by default and can be adjusted as circumstances require; a similarity close to 1 indicates that two patients are highly similar, and a similarity close to 0 indicates low similarity, that is, no correlation between the patients;
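The published text does not reproduce formulas (1) and (2), so the following Python sketch uses stand-in formulas that are purely assumptions: a behavior similarity of `1 - gap / (T + 1)`, which only satisfies the stated property that a shorter interval gives a higher similarity, and an inter-patient similarity that divides the summed behavior similarities by `sqrt(N(p_i) * N(p_j))`. The function names are hypothetical, and the real formulas may differ.

```python
from datetime import date
from math import sqrt


def behavior_similarity(t_i: date, t_j: date, T: int = 3) -> float:
    """ASSUMED stand-in for formula (1): 1 on the same day, lower for larger gaps."""
    gap = abs((t_i - t_j).days)
    return 1 - gap / (T + 1)


def patient_similarity(visits_i, visits_j, T: int = 3) -> float:
    """ASSUMED stand-in for formula (2).

    visits_i, visits_j: non-empty lists of (day, place) for one patient each,
    already de-duplicated as in step S1.
    """
    total = sum(
        behavior_similarity(t_i, t_j, T)
        for t_i, s_i in visits_i
        for t_j, s_j in visits_j
        if s_i == s_j and abs((t_i - t_j).days) <= T  # similar behaviors only
    )
    # N(p_i) and N(p_j) are the visit counts of the two patients.
    return total / sqrt(len(visits_i) * len(visits_j))


p1 = [(date(2020, 7, 15), "ID123"), (date(2020, 7, 22), "ID123")]
p2 = [(date(2020, 7, 15), "ID123"), (date(2020, 7, 23), "ID123")]
sim = patient_similarity(p1, p2)
```

Under these assumed formulas the two patients share two similar behaviors (gaps of 0 and 1 day), which gives `sim = (1 + 0.75) / 2 = 0.875`, above the default threshold of 0.8.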
step S3, mining maximal groups whose members are highly similar to each other;
first, the similarity Sim between each patient and every other patient is calculated according to formula (2); patient pairs whose Sim exceeds the inter-patient similarity threshold are then retained, and finally a sparse matrix of the highly similar patients is output; an example of the sparse matrix is shown in FIG. 2;
the inter-patient association network graph, that is, the relationships between patients, is then output according to the sparse matrix; an example of the association network graph is shown in FIG. 3;
Definition 6, association network graph: the association relationships between patients are expressed as a graph using the inter-patient similarity as the index; the graph consists of nodes and edges, where nodes represent patients, edges represent similarity relationships between patients, and the length of an edge represents the degree of similarity between the patients;
(1) Map(node, edge, edge length) denotes an association network graph;
(2) $N$ denotes the set of nodes;
(3) $E$ denotes the set of edges connecting the nodes;
(4) $W$ denotes the degree of similarity between nodes, so that $W_{ij} = Sim(p_i, p_j)$, $p_i, p_j \in N$;
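A minimal sketch of turning thresholded pairwise similarities into the association network graph is shown below. The dictionary-of-dictionaries layout and the name `build_association_graph` are assumptions for illustration; the source only states that a sparse matrix of highly similar patients is produced first.

```python
from collections import defaultdict


def build_association_graph(similarities, threshold=0.8):
    """Return {node: {neighbor: W}} for pairs whose Sim exceeds the threshold.

    similarities: dict mapping an unordered patient pair (p_i, p_j) to Sim(p_i, p_j).
    """
    graph = defaultdict(dict)
    for (p_i, p_j), sim in similarities.items():
        if p_i != p_j and sim > threshold:
            # The graph is undirected, so the weight is stored in both directions.
            graph[p_i][p_j] = sim
            graph[p_j][p_i] = sim
    return dict(graph)


sims = {("A", "B"): 0.9, ("A", "C"): 0.85, ("B", "C"): 0.95, ("C", "D"): 0.4}
graph = build_association_graph(sims)
```

Patient `D` does not appear in the graph because its only similarity (0.4) is below the threshold.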
Once the inter-patient association network graph is available, the maximal groups whose members are highly similar to each other are mined from it, based on the property that all individuals in a group are similar to one another; the specific steps are as follows:
Definition 7, subset:
a subset is a fully connected subgraph of the association network graph (a clique, in graph-theory terms), that is, any two nodes in the subset are connected by an edge; a subset represents a group, that is, any two patients in the subset are similar; an example is shown in FIG. 4;
Definition 8, maximal subset:
a subset is called a maximal subset if it cannot be expanded into a larger subset by adding any one or more nodes; a group is represented by a maximal subset; an example is shown in FIG. 5;
by the definition of the maximal subset, groups can be located in the inter-patient association network graph; mining all maximal subsets of the association network graph then finds all groups;
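Enumerating maximal subsets corresponds to enumerating maximal cliques. The source does not name an algorithm for this step; the sketch below uses the basic Bron-Kerbosch algorithm (without pivoting) as one standard way to do it, on a hypothetical toy graph.

```python
def maximal_cliques(adj, min_size=1):
    """Enumerate maximal cliques of an undirected graph.

    adj: {node: set of neighbor nodes}. Returns a list of frozensets.
    """
    cliques = []

    def expand(R, P, X):
        # R: current clique, P: candidates that extend R, X: already processed nodes.
        if not P and not X:
            if len(R) >= min_size:
                cliques.append(frozenset(R))
            return
        for v in list(P):
            expand(R | {v}, P & adj[v], X & adj[v])
            P = P - {v}
            X = X | {v}

    expand(set(), set(adj), set())
    return cliques


# Toy graph: A, B, C are pairwise connected; D is connected to C only.
adj = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}
cliques = maximal_cliques(adj)
groups = maximal_cliques(adj, min_size=3)
```

With `min_size=3`, only the group `{A, B, C}` remains; the pair `{C, D}` is a maximal subset but is too small.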
more than 2 persons can form a group; groups differ in the number of members, and nodes in the association network graph differ in the number of edges, so a group can be required to consist of at least h members; groups of different sizes affect the medical insurance fund differently, and the more members a group has, the higher the amount defrauded; h is set to 3 by default and can be modified according to the actual situation;
Definition 9, h-node set:
groups differ in the number of members, and nodes differ in the number of connecting edges; if a group is required to contain at least h members, each member has at least h-1 edges; the set of nodes satisfying this condition is the h-node set;
with $H$ denoting the h-node set, $H = \{n : n \in N,\ d(n) \ge h-1\}$, where $d(n)$ is the degree of node $n$, i.e. the number of edges of node $n$; that is, $H$ is the set of nodes having at least h-1 edges; the MH graph denotes the subgraph of the inter-patient association network graph formed by the nodes in $H$;
assuming h = 4 and taking FIG. 3 as an example, $H = \{A, B, C, D, E, F, G, I\}$, and the MH graph is shown in FIG. 6;
the h-node set $H$ matching the required group size h is found in the inter-patient association network graph, the MH graph is derived from it, and all groups are mined by exhaustively enumerating the maximal subsets on the MH graph; this process greatly reduces the amount of computation;
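The h-node set and the MH graph can be sketched in a few lines of Python. The toy graph below is hypothetical (it is not the graph of FIG. 3), and the sketch applies the degree filter once, as Definition 9 describes, rather than repeating it until the result stabilizes.

```python
def h_node_set(adj, h):
    """H = {n : d(n) >= h - 1}, where d(n) is the degree of node n."""
    return {n for n, neighbors in adj.items() if len(neighbors) >= h - 1}


def mh_graph(adj, h):
    """Subgraph of the association network graph induced by the h-node set."""
    H = h_node_set(adj, h)
    return {n: adj[n] & H for n in H}


# Toy graph: A, B, C, D are pairwise connected; E hangs off A, and F hangs off E.
adj = {
    "A": {"B", "C", "D", "E"},
    "B": {"A", "C", "D"},
    "C": {"A", "B", "D"},
    "D": {"A", "B", "C"},
    "E": {"A", "F"},
    "F": {"E"},
}
H = h_node_set(adj, h=4)
mh = mh_graph(adj, h=4)
```

With h = 4, nodes `E` and `F` have fewer than 3 edges and are dropped, so any later maximal subset search runs on four nodes instead of six.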
to further simplify the computation, partition-based exhaustive enumeration of maximal subsets is used on the MH graph;
Definition 10, node similarity:
the node similarity measures how similar node $n$ is to its adjacent nodes, that is, its average similarity to its adjacent nodes; with $C_n$ denoting the similarity of node $n$,
$$C_n = \frac{1}{d(n)} \sum_{m \in nei(n)} W_{nm}$$
where
(1) $d(n)$ denotes the degree of node $n$, i.e. the number of edges of node $n$;
(2) $nei(n)$ denotes the set of neighbor nodes of node $n$;
(3) $W_{nm}$ denotes the similarity between node $n$ and its adjacent node $m$;
the higher the node similarity, the more similar the node is to its adjacent nodes; therefore, before subgraphs are extracted from the MH graph, the nodes with high node similarity are found and used as seed nodes; the top 30% of nodes ranked by node similarity can be selected as seed nodes, and partition-based maximal subset enumeration is performed on the MH graph using the seed nodes, so as to obtain all groups;
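A minimal sketch of the node similarity and the seed selection follows, taking the node similarity as the mean edge weight to the adjacent nodes as described in Definition 10. The function names and the toy weights are hypothetical, and rounding the 30% cut-off up with at least one seed is an assumption, since the source does not say how a fractional count is handled. The partition-based enumeration itself is not specified in the text and is not attempted here.

```python
import math


def node_similarity(weights, n):
    """C_n: average similarity between node n and its neighbors.

    weights: {node: {neighbor: W_nm}} for the MH graph.
    """
    neighbors = weights[n]
    return sum(neighbors.values()) / len(neighbors) if neighbors else 0.0


def seed_nodes(weights, top_fraction=0.30):
    """Nodes in the top fraction by node similarity, highest first."""
    ranked = sorted(weights, key=lambda n: node_similarity(weights, n), reverse=True)
    # Assumption: round the count up and always keep at least one seed.
    k = max(1, math.ceil(len(ranked) * top_fraction))
    return ranked[:k]


weights = {
    "A": {"B": 0.9, "C": 0.85},
    "B": {"A": 0.9, "C": 0.95},
    "C": {"A": 0.85, "B": 0.95, "D": 0.81},
    "D": {"C": 0.81},
}
seeds = seed_nodes(weights)
```

In this toy graph the node similarities are 0.925 for `B`, 0.875 for `A`, 0.87 for `C` and 0.81 for `D`, so the two seeds are `B` and `A`.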
step S4, manually reviewing and judging the suspicious groups according to the visit details of the group members.
A group only indicates that the visit behaviors of any two of its patients are highly similar; it does not indicate that everyone in the group is a fraudster. For example, a normal patient may regularly attend follow-up visits together with a few close relatives or acquaintances (as is often the case for patients in a nursing home); such normal patients would also be mined as a group because their behaviors are highly similar. Groups are therefore divided into normal groups and suspicious groups: a group of normal patients brought together by coincidence or by special circumstances is called a normal group, and an abnormal group is called a suspicious group.
The groups that are output therefore still need to be sent for manual review; the review can be assisted by indicators drawn from the visit details of the group members, such as visit frequency, visit period, visit cost, the departments and doctors visited, and the medicines commonly obtained and their quantities.
The embodiment of the invention also provides a medical insurance group fraud monitoring system, which comprises:
a memory storing a computer program;
a processor for executing the computer program, the computer program being operable to perform the steps of the method as hereinbefore described except for step S4.
Embodiments of the present invention also propose a storage medium having stored therein a computer program configured to perform the steps of the method as described above except for step S4 when executed.
Finally, it should be noted that the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting, and although the present invention has been described in detail with reference to examples, it should be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention, which should be covered by the claims of the present invention.

Claims (9)

1. A medical insurance group fraud monitoring method, characterized in that the method comprises the steps of:
step S1, generating an analysis dataset of the patient;
step S2, calculating the similarity between patients;
step S3, mining maximal groups whose members are highly similar to each other;
let $P = \{p_1, p_2, \ldots, p_m\}$ denote the set of patients who sought treatment, and let $G = \{g_1, g_2, \ldots, g_n\}$ denote a group with similar visit behavior;
$G \subseteq P$,
and the visit behaviors of any two patients $g_i$, $g_j$ in $G$ are highly similar;
the visit behavior refers to the activity of a patient in one visit; the behavior $b$ of patient $p$ seeking treatment at a time $t$ and a place $s$ is recorded as $b = (p, t, s)$; the place $s$ is a doctor, a department or a hospital;
similar behavior means that different patients p have made the same type of visit within a certain period of time; $SB(p_i, p_j)$ denotes the set of similar behaviors of any two patients;
step S1 specifically includes:
the following fields are extracted from the patients' visit data imported from the hospital:
1) the date of the visit;
2) hospital ID and/or department ID and/or doctor ID;
3) a patient ID;
step S2 specifically includes:
first, the similarity of similar behaviors is calculated; it measures how alike two similar behaviors are; if $b_i = (p_i, t_i, s_i)$ and $b_j = (p_j, t_j, s_j)$ are similar behaviors, then $s_i = s_j$ and $|t_i - t_j| \le T$, where $T$ is a time interval; the similarity of similar behaviors is calculated by formula (1):
[Formula (1): equation image FDA0003561994950000012 is not reproduced in this text]
then, the similarity between patients is calculated by formula (2):
[Formula (2): equation image FDA0003561994950000013 is not reproduced in this text]
where $N(p_i)$ denotes the number of visits of patient $p_i$ within a certain period of time, and $N(p_j)$ denotes the number of visits of patient $p_j$ within the same period;
step S3 specifically includes:
first, the similarity Sim between each patient and every other patient is calculated according to formula (2); the patient pairs whose Sim exceeds the inter-patient similarity threshold are then selected, and a sparse matrix of highly similar patients is output;
then the inter-patient association network graph is output according to the sparse matrix; in the association network graph, N denotes the set of nodes, E denotes the set of edges connecting the nodes, and W denotes the similarity between nodes, so that W_ij = Sim(p_i, p_j), with p_i, p_j ∈ N;
once the inter-patient association network graph has been obtained, the maximal groups whose members are highly similar to one another are mined from it.
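The thresholding and graph construction of step S3 can be sketched as follows, taking the pairwise Sim values as given. The dictionary representation, the function name `build_graph`, and the choice to keep only nodes that retain at least one edge are assumptions; the default threshold of 0.8 is the value stated in claim 4.

```python
def build_graph(sim, threshold=0.8):
    """Build (N, E, W) from pairwise similarities.

    sim maps (p_i, p_j) to Sim(p_i, p_j). Only pairs with Sim above the
    threshold become edges, which gives the sparse structure described
    in step S3. Edges are undirected, so each is stored as a frozenset.
    """
    W = {}
    for (p_i, p_j), value in sim.items():
        if p_i != p_j and value > threshold:
            W[frozenset((p_i, p_j))] = value
    E = set(W)
    # Assumption: patients with no remaining edge are left out of N.
    N = set().union(*E) if E else set()
    return N, E, W


sim = {
    ("P1", "P2"): 0.90,
    ("P1", "P3"): 0.85,
    ("P2", "P3"): 0.95,
    ("P3", "P4"): 0.40,  # below the threshold: no edge
}
N, E, W = build_graph(sim)
```

Storing only the surviving pairs keeps memory proportional to the number of highly similar pairs rather than to the square of the number of patients.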
2. The medical insurance group fraud monitoring method according to claim 1, wherein,
in step S3, mining the maximal groups whose members are highly similar to one another in the association network graph specifically includes:
a subset is a fully connected subgraph of the association network graph, that is, any two nodes in the subset are connected by an edge (a clique, in graph-theory terms); a subset represents a group, that is, any two patients in the subset are similar;
a subset is called a maximal subset if it cannot be expanded into a larger subset by adding any further node or nodes; a group is represented by a maximal subset;
by the definition of a maximal subset, groups can be located in the inter-patient association network graph; all maximal subsets in the graph are then mined, which yields all groups;
a set of nodes satisfying the condition that a group comprises at least h members, each of which has at least h-1 edges, is an h-node set;
letting H denote the h-node set, H = {n | n ∈ N, d(n) ≥ h-1}, where d(n) is the degree of node n, that is, the number of edges of node n; H is thus the set of nodes with at least h-1 edges; the MH graph denotes the subgraph of the inter-patient association network graph formed by the nodes in H;
the h-node set H satisfying the group member number h is found in the inter-patient association network graph, the MH graph of H is derived, and all maximal subsets of the MH graph are then enumerated exhaustively to mine all groups.
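The pruning and enumeration described in claim 2 can be sketched as follows. The claim does not name an enumeration algorithm; the Bron-Kerbosch algorithm with pivoting is substituted here as one standard way to list maximal cliques. The adjacency-dictionary representation and all function names are assumptions.

```python
def h_node_set(adj, h):
    """H = {n in N : d(n) >= h - 1}."""
    return {n for n, nbrs in adj.items() if len(nbrs) >= h - 1}


def mh_graph(adj, H):
    """MH graph: the subgraph induced by the nodes in H."""
    return {n: adj[n] & H for n in H}


def maximal_cliques(adj):
    """Bron-Kerbosch with pivoting; yields each maximal clique once."""
    def expand(R, P, X):
        if not P and not X:
            yield frozenset(R)
            return
        # Pivot on the node with the most neighbors among the candidates.
        pivot = max(P | X, key=lambda n: len(adj[n] & P))
        for v in list(P - adj[pivot]):
            yield from expand(R | {v}, P & adj[v], X & adj[v])
            P = P - {v}
            X = X | {v}
    yield from expand(set(), set(adj), set())


def mine_groups(adj, h):
    """Maximal cliques of the MH graph that have at least h members."""
    sub = mh_graph(adj, h_node_set(adj, h))
    return {c for c in maximal_cliques(sub) if len(c) >= h}


# A triangle A-B-C with a pendant node D attached to C.
adj = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}
groups_h3 = mine_groups(adj, h=3)
groups_h2 = mine_groups(adj, h=2)
```

With h = 3, node D has degree 1 and is pruned before enumeration, so only the triangle remains. With h = 2, D survives and the edge C-D is reported as a second group.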
3. The medical insurance group fraud monitoring method according to claim 2, wherein, after the MH graph is derived in step S3, the top X% of nodes ranked by node similarity are selected as seed nodes, and partition-based maximal subset enumeration is performed on the MH graph using the seed nodes, thereby obtaining all groups;
the calculation formula of the node similarity is as follows:
Figure FDA0003561994950000021
wherein:
(1) d(n) denotes the degree of node n, that is, the number of edges of node n;
(2) Nei(n) denotes the set of neighbor nodes of node n;
(3) W_nm denotes the similarity between node n and its neighbor node m.
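The seed selection of claim 3 can be sketched as follows. The node similarity formula appears only as a figure and is not reproduced here, so the scores are taken as an input. Rounding the seed count up, breaking ties by node name, and the function name `select_seeds` are assumptions; the partition-based enumeration itself is not shown.

```python
import math


def select_seeds(node_similarity, x_percent=30):
    """Return the top x_percent of nodes, ranked by node similarity.

    node_similarity maps each MH-graph node to a precomputed score.
    Rounding the count up and tie-breaking by name are assumptions.
    """
    ranked = sorted(node_similarity, key=lambda n: (-node_similarity[n], n))
    k = math.ceil(len(ranked) * x_percent / 100)
    return ranked[:k]


scores = {"A": 0.95, "B": 0.90, "C": 0.92, "D": 0.81, "E": 0.88}
seeds = select_seeds(scores, x_percent=30)
```

For five nodes and X% = 30%, the count is 1.5 rounded up to 2, so the two highest-scoring nodes, A and C, become seeds.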
4. The medical insurance group fraud monitoring method according to claim 1, wherein,
the inter-patient similarity threshold is set to 0.8.
5. The medical insurance group fraud monitoring method according to claim 2, wherein,
h is set to any value from 3 to 6.
6. The medical insurance group fraud monitoring method according to claim 3, wherein,
X% is set to 30%.
7. The medical insurance group fraud monitoring method according to any one of claims 1 to 6, further comprising, after step S3:
step S4: manually reviewing and judging the suspicious groups according to the visit details of the group members.
8. A medical insurance group fraud monitoring system, comprising:
a memory storing a computer program;
a processor configured to run the computer program, wherein the computer program, when run, performs the steps of the method of any one of claims 1 to 6.
9. A storage medium, characterized in that
the storage medium stores a computer program configured to perform, when executed, the steps of the method of any one of claims 1 to 6.
CN202010818035.1A 2020-08-14 2020-08-14 Medical insurance group fraud monitoring method, system and storage medium Active CN111986034B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010818035.1A CN111986034B (en) 2020-08-14 2020-08-14 Medical insurance group fraud monitoring method, system and storage medium

Publications (2)

Publication Number Publication Date
CN111986034A (en) 2020-11-24
CN111986034B (en) 2022-05-10

Family

ID=73434976

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010818035.1A Active CN111986034B (en) 2020-08-14 2020-08-14 Medical insurance group fraud monitoring method, system and storage medium

Country Status (1)

Country Link
CN (1) CN111986034B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113011971A (en) * 2021-03-31 2021-06-22 深圳前海微众银行股份有限公司 Risk measurement method, device and system and computer storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109919781A (en) * 2019-01-24 2019-06-21 平安科技(深圳)有限公司 Case recognition methods, electronic device and computer readable storage medium are cheated by clique
CN110413707A (en) * 2019-07-22 2019-11-05 百融云创科技股份有限公司 The excavation of clique's relationship is cheated in internet and checks method and its system
CN110827159A (en) * 2019-11-11 2020-02-21 上海交通大学 Financial medical insurance fraud early warning method, device and terminal based on relational graph

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140172439A1 (en) * 2012-12-19 2014-06-19 Verizon Patent And Licensing Inc. Organized healthcare fraud detection

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
基于社交网络的犯罪团伙发现算法研究;潘潇 王斌君;《软件导刊》;湖北省科技传媒有限责任公司;20181215;第17卷(第12期);1-10 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant