CN113822365A - Medical data storage and big data mining method and system based on block chain technology - Google Patents

Medical data storage and big data mining method and system based on block chain technology

Info

Publication number
CN113822365A
Authority
CN
China
Prior art keywords
hospital
abnormal
vector
acquiring
medication
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111144685.3A
Other languages
Chinese (zh)
Other versions
CN113822365B (en)
Inventor
刘玉棚
陈梓城
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Hengsheng Yuntai Network Technology Co ltd
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual filed Critical Individual
Priority to CN202111144685.3A priority Critical patent/CN113822365B/en
Publication of CN113822365A publication Critical patent/CN113822365A/en
Application granted granted Critical
Publication of CN113822365B publication Critical patent/CN113822365B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Public Health (AREA)
  • Biomedical Technology (AREA)
  • Databases & Information Systems (AREA)
  • Pathology (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The invention relates to the technical field of block chains, and in particular to a medical data storage and big data mining method and system based on block chain technology. The method comprises the following steps: acquiring the medicine use feature vector of each hospital to divide the hospitals into a normal hospital set and an abnormal hospital set; acquiring the abnormal medication vector of each abnormal hospital, and from it the abnormal feature stability degree, to classify hospitals into a concerned hospital set and a reference hospital set; acquiring the similarity between any two reference hospital sets, and dividing the reference hospital sets into a plurality of similar categories according to the similarity so as to acquire a first fusion vector; acquiring the total variation and mean value of the reference hospital set, and combining them with the first fusion vector to acquire a second fusion vector; and acquiring the acceptable degree of each hospital according to the second fusion vector and determining the hospital to which the reward is given. By fusing the medical data features of hospitals, the method judges whether a hospital uses drugs abnormally, thereby mitigating vicious competition among hospitals.

Description

Medical data storage and big data mining method and system based on block chain technology
Technical Field
The invention relates to the technical field of block chains, in particular to a medical data storage and big data mining method and system based on a block chain technology.
Background
With the development of society, the number of hospitals keeps increasing and medical and health technology has improved greatly. Hospitals generate large amounts of medical data at all times, including but not limited to patients' illness information, medicine use, and medical equipment use.
Because medical data are not shared among hospitals, patients' medical records may be tampered with, or medication within a hospital may be abnormal, without more accurate information being obtainable, and some hospitals may even engage in vicious competition such as drug abuse and drug price inflation. The prior art cannot mine data features from large amounts of medical data, so it is difficult to accurately identify whether a hospital abuses drugs or sets unreasonable drug prices.
Disclosure of Invention
In order to solve the above technical problems, an object of the present invention is to provide a method for storing medical data and mining big data based on a block chain technique, wherein the technical scheme adopted is as follows:
In a first aspect, an embodiment of the present invention provides a method for storing medical data and mining big data based on block chain technology, the method including the following steps:
a target hospital traverses the medical data of each block on the block chain to obtain medicine use feature vectors, wherein a block is a block in which a hospital stores its medical data, and the medicine use feature vector is the appearance proportion of the medical data; the traversed hospitals are divided into a normal hospital set and an abnormal hospital set according to the medicine use feature vectors;
acquiring the mean value of the medicine use feature vectors in the normal hospital set, acquiring the abnormal medication vector of each abnormal hospital in the abnormal hospital set according to the mean value, acquiring the abnormal feature stability degree according to the abnormal medication vector sequence, and classifying the hospitals into a concerned hospital set and a reference hospital set according to the abnormal feature stability degree;
acquiring the similarity between the abnormal medication vector sequences of the reference hospitals corresponding to any two target hospitals, and dividing the reference hospital sets into a plurality of similar categories according to the similarity; acquiring a first fusion vector according to the abnormal feature stability degree and the abnormal medication vector of the reference hospitals in each similar category;
acquiring the total variation of the abnormal feature stability degree and the mean value of the abnormal feature stability degree in the reference hospital set, and acquiring a second fusion vector according to the first fusion vector and the total variation and mean value of the reference hospital set;
acquiring the abnormal degree of a hospital according to the second fusion vector, and acquiring the acceptable degree of the hospital according to the abnormal degree and the workload, wherein the workload is the ratio of the number of blocks traversed by the target hospital to the total number of blocks on the block chain; the hospital to which the reward is given is determined according to the acceptable degree of each hospital.
Preferably, the step of obtaining the normal hospital set and the abnormal hospital set according to the medicine use feature vector includes:
clustering the feature vectors of all traversed hospitals, wherein the hospitals whose feature vectors are concentrated form the normal hospital set and the remaining hospitals form the abnormal hospital set.
Preferably, the step of obtaining the abnormal medication vector of each hospital in the abnormal hospital set further includes:
acquiring the mean value of all feature vectors in the normal hospital set, calculating the difference between each feature vector in the abnormal hospital set and the mean value, and taking the absolute value of the difference to obtain the abnormal medication vector of the corresponding abnormal hospital.
Preferably, the step of obtaining the abnormal feature stability degree according to the abnormal medication vector sequence includes:
obtaining the modulus of each abnormal medication vector in the abnormal medication vector sequence to form an abnormal medication degree sequence, and obtaining the abnormal feature stability degree from the proportion of non-zero elements in the abnormal medication degree sequence.
Preferably, the step of obtaining the similarity between the reference hospital sets corresponding to any two target hospitals includes:
acquiring the intersection of the two reference hospital sets corresponding to the two target hospitals, wherein each hospital in the intersection corresponds to two abnormal medication vector sequences; calculating the intersection-over-union ratio of the two reference hospital sets and the abnormal feature stability degrees of the two abnormal medication vector sequences; and acquiring the similarity according to the intersection-over-union ratio and the two abnormal feature stability degrees.
Preferably, the step of obtaining the first fusion vector according to the abnormal feature stability degree and the abnormal medication vector of the reference hospitals in each similar category includes:
acquiring the ratio of the abnormal feature stability degree of any reference hospital in the similar category to the sum of the abnormal feature stability degrees of all reference hospitals in that category, and acquiring the first fusion vector by taking the ratio as a first weight of the abnormal medication vector of that reference hospital.
Preferably, the step of obtaining the second fusion vector according to the first fusion vector, the total variation and the mean value includes:
acquiring the ratio of the mean value to the total variation, and acquiring the second fusion vector by taking the ratio as a second weight of the first fusion vector.
Preferably, the step of obtaining the acceptable degree of the hospital according to the abnormal medication degree and the workload includes:
acquiring the product of the abnormal medication degree of the hospital and the workload, the product being the acceptable degree.
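The weighting steps in the preferred features above can be illustrated with a minimal Python sketch. The function names and sample numbers are hypothetical. The total variation, the mean stability degree and the abnormal medication degree are taken as given inputs here, because this passage does not define how they are computed. A zero total variation or zero stability sum would need separate handling.

```python
def first_fusion_vector(vectors, stabilities):
    """Weighted sum of abnormal medication vectors in one similar category.

    Each weight is the hospital's stability degree divided by the sum of
    stability degrees in the category (the "first weight").
    """
    total = sum(stabilities)
    weights = [s / total for s in stabilities]
    dim = len(vectors[0])
    return [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]


def second_fusion_vector(first, mean_stability, total_variation):
    """Scale the first fusion vector by mean / total variation (the "second weight")."""
    w = mean_stability / total_variation
    return [w * x for x in first]


def acceptable_degree(abnormal_degree, traversed_blocks, total_blocks):
    """Product of the abnormal medication degree and the workload."""
    workload = traversed_blocks / total_blocks
    return abnormal_degree * workload


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
first = first_fusion_vector([[1.0, 0.0], [0.0, 1.0]], [0.25, 0.75])
second = second_fusion_vector(first, mean_stability=0.5, total_variation=0.25)
score = acceptable_degree(0.5, traversed_blocks=500, total_blocks=1000)
```

With these inputs the first weights are 0.25 and 0.75, the second weight is 2.0, and the workload is 0.5.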
In a second aspect, another embodiment of the present invention provides a medical data storage and big data mining system based on a block chain technology, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the above method when executing the computer program.
The invention has the following beneficial effects. Hospitals are divided into a normal hospital set and an abnormal hospital set by obtaining the medicine use feature vector of each hospital; the abnormal feature stability degree of each abnormal hospital in the abnormal hospital set is obtained, and the hospitals are divided into a reference hospital set and a concerned hospital set according to the abnormal feature stability degree. The reference hospital sets are further divided into a plurality of similar categories according to the similarity between the abnormal medication vector sequences of the reference hospitals; a first fusion vector is acquired for each similar category, and a second fusion vector is acquired from the first fusion vector and the total variation and mean value of the reference hospital set. The acceptable degree of each hospital is then acquired, and rewards are distributed to the hospitals with a large acceptable degree. By analyzing and mining a large amount of medical data, the method obtains useful information about each hospital, fuses that information, and judges whether a hospital uses medicines abnormally or prices them unreasonably, thereby mitigating vicious competition among hospitals.
Drawings
To illustrate the embodiments of the present invention or the technical solutions and advantages of the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings in the following description show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a method for storing medical data and mining big data based on a block chain technique according to an embodiment of the present invention.
Detailed Description
To further explain the technical means adopted by the present invention to achieve its intended objects, and their effects, the embodiments, structures, features and effects of the medical data storage and big data mining method and system based on block chain technology are described in detail below with reference to the accompanying drawings and preferred embodiments. In the following description, different references to "one embodiment" or "another embodiment" do not necessarily refer to the same embodiment. Furthermore, particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
The embodiment of the invention applies to scenarios involving multiple hospitals and aims to solve the problems of unreasonable medicine prices and vicious competition among hospitals. Hospitals are divided into a normal hospital set and an abnormal hospital set by acquiring the medicine use feature vector of each hospital; the abnormal feature stability degree of each abnormal hospital in the abnormal hospital set is acquired, and the hospitals are divided into a reference hospital set and a concerned hospital set according to the abnormal feature stability degree. The reference hospital sets are further divided into a plurality of similar categories according to the similarity between the abnormal medication vector sequences of the reference hospitals; a first fusion vector is acquired for each similar category, and a second fusion vector is acquired from the first fusion vector and the total variation and mean value of the reference hospital set. The acceptable degree of each hospital is then acquired, and rewards are distributed to the hospitals with a large acceptable degree. Analyzing and mining big data yields more useful information, which is fused to judge whether hospitals abuse drugs or price medicines unreasonably, so that vicious medical competition is better mitigated.
The following describes a specific scheme of a method and a system for storing medical data and mining big data based on a block chain technology in detail with reference to the accompanying drawings.
Referring to fig. 1, a flowchart of a method for storing medical data and mining big data based on a blockchain technique according to an embodiment of the present invention is shown, where the method includes the following steps:
Step S100, the target hospital traverses the medical data of each block on the block chain to obtain medicine use feature vectors, wherein a block is a block in which a hospital stores its medical data, and the medicine use feature vector is the appearance proportion of the medical data; the traversed hospitals are divided into a normal hospital set and an abnormal hospital set according to the medicine use feature vectors.
Medical data is the data generated by each hospital during its daily work. To share data among hospitals, the large amount of generated medical data is packed into blocks in chronological order and the blocks are appended to a historical block chain. Every hospital can traverse the historical block chain, and each block on it contains all the medical data generated and stored by one hospital.
Preferably, in order to traverse as many blocks in the block chain as quickly as possible and to increase the diversity and generality of the final data, a readable probability p is set for each block, so that the unreadable probability of the block is 1-p. When a hospital traverses the blocks, it need not traverse the medical data of every block in sequence; it can instead preferentially traverse the medical data in blocks with a high readable probability.
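One way to read the preference for highly readable blocks is as a descending sort on the readable probability p. The sketch below makes that assumption. The text does not say how blocks are actually selected, and the block identifiers and probabilities are hypothetical.

```python
def traversal_order(readable_prob):
    """Return block ids ordered from highest to lowest readable probability p.

    readable_prob: dict mapping a block id to its readable probability p.
    Sorting by p is an assumed selection rule, not one stated in the text.
    """
    return sorted(readable_prob, key=lambda block: readable_prob[block], reverse=True)


# Hypothetical blocks and probabilities.
blocks = {"b1": 0.2, "b2": 0.9, "b3": 0.6}
order = traversal_order(blocks)
```

A hospital with a limited time budget would then read blocks from the front of `order` until the budget runs out.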
The medicine use feature vector is acquired as follows. Any hospital that generates a new block is selected as a target hospital; after the target hospital traverses n blocks, all medical data in the n blocks can be acquired. In the data read by the target hospital, a total of N medical records contain the patient diagnosis result m, corresponding to N medication orders, and the number of medicine types in the N medication orders is P. Among the N medication orders, N_k concern hospital k, i.e. hospital k corresponds to N_k medication orders. The medicine use feature vector of hospital k is denoted v_km; its dimension is the number of medicine types P, each dimension corresponds to one medicine in the N medication orders, and the value of each dimension is:
Figure BDA0003285218130000041
where N_k represents the number of records concerning hospital k among the N medication order records; a represents the number of times the medicine corresponding to the p-th dimension appears in the N_k medication orders; and b represents the number of times the medicine corresponding to the p-th dimension appears in the N medication orders.
It should be noted that the smaller the value of
Figure BDA0003285218130000042
the more consistent the probability that hospital k uses the medicine is with the probability that all hospitals use it, and hospital k is then considered to use the medicine normally; the larger the value of
Figure BDA0003285218130000051
the more the probability that hospital k uses the medicine differs from the probability that all hospitals use it, and hospital k's use of the medicine is then considered abnormal.
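The formula images are not reproduced in this text, so the exact per-dimension value is unknown. The following Python sketch assumes, purely for illustration, that the value is the absolute difference between the hospital's own usage rate a/N_k and the overall usage rate b/N. That reading matches the remark that small values mean normal use, but it is an assumption and not the patent's formula. The sketch also counts a medicine at most once per medication order, which is a second assumption.

```python
from collections import Counter


def drug_use_vector(orders, hospital, drugs):
    """Assumed per-drug value |a/N_k - b/N| for one hospital.

    orders: list of (hospital_id, [drug names]) pairs for one diagnosis result.
    drugs:  ordered list of the P drug types (one vector dimension each).
    """
    n_total = len(orders)
    own = [ds for h, ds in orders if h == hospital]
    n_k = len(own)
    if n_k == 0:
        raise ValueError("hospital has no medication orders in the traversed data")
    # b: orders (all hospitals) containing the drug; a: this hospital's orders containing it.
    b = Counter(drug for _, ds in orders for drug in set(ds))
    a = Counter(drug for ds in own for drug in set(ds))
    return [abs(a[p] / n_k - b[p] / n_total) for p in drugs]


# Hypothetical medication orders for hospitals "A" and "B".
orders = [("A", ["x", "y"]), ("A", ["x"]), ("B", ["y"]), ("B", ["y", "z"])]
vector_a = drug_use_vector(orders, "A", ["x", "y", "z"])
```

Here hospital A prescribes drug x in both of its orders while x appears in half of all orders, so the first component is 0.5.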
The feature vectors of all traversed hospitals are then clustered; the hospitals whose feature vectors are concentrated form the normal hospital set, and the rest form the abnormal hospital set. Specifically, the embodiment of the invention clusters the medicine use feature vectors obtained for diagnosis result m with the mean shift method. By default the final clustering result has only one class, in which the medicine use feature vectors are concentrated; the hospital corresponding to each medicine use feature vector in this class is regarded as a normal hospital that uses medicines normally, and all normal hospitals form a set S, called the normal hospital set. The scattered hospitals that do not belong to S are regarded as abnormal hospitals that use medicines abnormally and form the abnormal hospital set. When the final clustering result has several classes, the class with the most elements is kept as the normal hospital set and the others form the abnormal hospital set.
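The rule of keeping the largest class can be sketched independently of the clustering itself. The snippet below assumes cluster labels have already been produced by some clustering routine (mean shift in the embodiment). The function name and labels are hypothetical.

```python
from collections import Counter


def split_by_largest_cluster(labels):
    """Split hospitals into (normal, abnormal) sets.

    labels: dict mapping hospital id -> cluster label from any clustering step.
    The most populated cluster is kept as the normal hospital set.
    """
    biggest, _ = Counter(labels.values()).most_common(1)[0]
    normal = {h for h, c in labels.items() if c == biggest}
    abnormal = set(labels) - normal
    return normal, abnormal


# Hypothetical labels for five hospitals.
normal, abnormal = split_by_largest_cluster({"A": 0, "B": 0, "C": 1, "D": 0, "E": 2})
```

If two clusters tie for the largest size, `most_common` returns the one encountered first; the text does not say how such a tie should be resolved.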
Step S200, obtaining the mean value of the medicine use feature vectors in the normal hospital set, obtaining the abnormal medication vector of each abnormal hospital in the abnormal hospital set according to the mean value, obtaining the abnormal feature stability degree according to the abnormal medication vector sequence, and classifying the hospitals into a concerned hospital set and a reference hospital set according to the abnormal feature stability degree.
When the medical data is acquired by traversing the blocks subsequently, in order to save subsequent calculation time, only the medical data of the concerned hospital set and the reference hospital set is acquired, and other medical data in the blocks do not need to be read.
With the abnormal hospital set and the normal hospital set obtained in step S100, the mean value of all medicine use feature vectors in the normal hospital set is computed, the difference between each medicine use feature vector in the abnormal hospital set and the mean value is calculated, and the absolute value of the difference is taken as the abnormal medication vector of the corresponding abnormal hospital.
Specifically, the mean value of the medicine use feature vectors in the normal hospital set is denoted v_1. For each abnormal hospital, the absolute value of the difference between its medicine use feature vector and the mean value v_1 is calculated and denoted V; V is the abnormal medication vector of that abnormal hospital.
The larger the element value in the abnormal medication vector V is, the larger the difference between the medication of the abnormal hospital and that of most other hospitals is, and the more abnormal the medication is. Therefore, abnormal medication vectors corresponding to all abnormal hospitals can be obtained, and the zero vector is used as the abnormal medication vector of the normal hospital in the embodiment of the invention.
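A minimal Python sketch of this step follows. It uses plain lists and hypothetical sample vectors, and takes the absolute difference element by element, which is the natural reading of "absolute value of the difference" for vectors.

```python
def mean_vector(vectors):
    """Element-wise mean of the feature vectors in the normal hospital set (v_1)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def abnormal_medication_vector(vector, normal_mean):
    """Element-wise absolute difference between a feature vector and v_1."""
    return [abs(x - m) for x, m in zip(vector, normal_mean)]


# Hypothetical feature vectors.
v1 = mean_vector([[0.25, 0.5], [0.75, 0.5]])
V = abnormal_medication_vector([1.0, 0.25], v1)
zero = [0.0] * len(v1)  # normal hospitals are assigned the zero vector
```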
Further, the step of obtaining the abnormal feature stability degree according to the abnormal medication vector sequence comprises: and obtaining the modulus of each abnormal medication vector in the abnormal medication vector sequence to obtain an abnormal medication degree sequence, and obtaining the stability degree of the abnormal characteristics according to the proportion of non-zero elements in the abnormal medication degree sequence.
Specifically, in all data acquired after the target hospital traverses n blocks, the abnormal medication vector of the hospital k is recorded as v (n), the module length l (n) of the abnormal medication vector is calculated, and the module length is used as the abnormal medication degree of the corresponding abnormal medication vector. When the number of traversed blocks is different, different abnormal medication vectors and abnormal medication degrees are obtained. And forming a sequence by the abnormal medication vectors obtained when the number n of the traversal blocks takes different values, and recording the sequence as an abnormal medication vector sequence.
Because the number of blocks n takes different values during traversal, the dimension of each abnormal medication vector sequence also differs. In the embodiment of the present invention, a block number threshold th1 is set, and the abnormal medication vector sequence and the corresponding abnormal medication degree sequence are acquired once the number of blocks n exceeds th1. The number of non-zero elements in the abnormal medication degree sequence is counted and divided by the total number of elements in the sequence; the resulting proportion of non-zero elements is taken as the abnormal feature stability degree of hospital k.
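The computation can be sketched in Python as follows. The function name and the sample sequence are hypothetical. The modulus is taken as the Euclidean norm, which is the usual meaning of module length and is assumed here.

```python
import math


def stability_degree(vector_sequence):
    """Proportion of non-zero module lengths in an abnormal medication vector sequence."""
    if not vector_sequence:
        raise ValueError("empty abnormal medication vector sequence")
    # Abnormal medication degree sequence: one Euclidean norm per vector.
    degrees = [math.sqrt(sum(x * x for x in v)) for v in vector_sequence]
    nonzero = sum(1 for d in degrees if d != 0)
    return nonzero / len(degrees)


# Hypothetical sequence: the hospital is abnormal at two of four checkpoints.
degree = stability_degree([[0.0, 0.0], [0.3, 0.4], [0.0, 0.1], [0.0, 0.0]])
```

Because padding entries are zero vectors, padding a short sequence lowers its stability degree.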
It should be noted that, in the embodiment of the present invention, the block number threshold is set to th1 = 1000 and the traversal time is set to 3 minutes; by default at least 1000 blocks are traversed within 3 minutes. The dimension of the abnormal medication vector sequence corresponding to the maximum number of blocks is used as the final dimension, and any abnormal medication vector sequence whose dimension is smaller than the final dimension is padded with zero elements.
In the same way as for hospital k, the abnormal feature stability degrees of all abnormal hospitals in the data obtained after traversing n blocks are acquired. A maximum threshold and a minimum threshold of the abnormal feature stability degree are set: when the abnormal feature stability degree is greater than the maximum threshold, the corresponding hospital is taken as a concerned hospital; when it is smaller than the minimum threshold, the corresponding hospital is taken as a reference hospital. The concerned hospital set and the reference hospital set are thus obtained.
Preferably, in the embodiment of the present invention, the maximum threshold of the degree of stability of the abnormal feature is set to 0.7, and the minimum threshold is set to 0.3.
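The thresholding can be sketched as below, using the 0.7 and 0.3 values of the embodiment. The function name and sample degrees are hypothetical. Hospitals whose stability degree lies between the two thresholds are placed in neither set here, since the text does not say how they are treated.

```python
def classify_hospitals(stability, max_th=0.7, min_th=0.3):
    """Split hospitals into (concerned, reference) sets by stability degree.

    stability: dict mapping hospital id -> abnormal feature stability degree.
    """
    concerned = {h for h, s in stability.items() if s > max_th}
    reference = {h for h, s in stability.items() if s < min_th}
    return concerned, reference


# Hypothetical stability degrees.
concerned, reference = classify_hospitals({"A": 0.9, "B": 0.1, "C": 0.5})
```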
Step S300, obtaining the similarity between the abnormal medication vector sequences of the reference hospitals corresponding to any two target hospitals, and dividing the reference hospital set into a plurality of similar categories according to the similarity; and acquiring a first fusion vector according to the abnormal feature stability degree and the abnormal medication vector corresponding to the reference hospital in each similar category.
The concerned hospital set and the reference hospital set are acquired according to step S200. As the target hospital traverses more blocks, the acquired data change, so the concerned hospitals and reference hospitals change accordingly and their abnormal medication vectors need to be recalculated. The data acquired when each target hospital traverses its last block are used as the final concerned hospital set, reference hospital set, and abnormal medication vector sequences of the hospitals in them.
Preferably, in the embodiment of the present invention, the focus hospital and the reference hospital are recalculated every time the target hospital traverses 100 blocks.
It should be noted that, once the target hospital has traversed 1000 blocks, recalculation is required for every 100 blocks traversed thereafter, and the last block traversed is the last block reached within the 3 minutes.
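One possible reading of this schedule is sketched below. It assumes the first recalculation happens at block 1100 (th1 plus one step) and that the last traversed block always triggers a final recalculation. Both points are interpretations, and the function name is hypothetical.

```python
def recalculation_points(last_block, th1=1000, step=100):
    """Block counts at which the hospital sets would be recomputed (assumed schedule)."""
    if last_block <= th1:
        return []
    points = list(range(th1 + step, last_block + 1, step))
    if not points or points[-1] != last_block:
        points.append(last_block)  # the final traversed block always counts
    return points


# Hypothetical run in which 1250 blocks are traversed within the time limit.
points = recalculation_points(1250)
```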
Furthermore, each target hospital traverses the historical blocks and extracts data within the default time period of 3 minutes, yielding the concerned hospital set and reference hospital set corresponding to each target hospital and the abnormal medication vector sequence of each hospital.
For hospital k, the data obtained by each target hospital, and hence the abnormal medication vector sequence obtained for hospital k, differ from one target hospital to another, so the data of hospital k are fused across the results obtained by all target hospitals. The reference hospital sets are divided into different similar categories using the similarity between reference hospital sets, and each similar category is fused so as to judge whether hospital k has a serious abnormal medication condition.
The step of obtaining the similarity includes: acquiring the intersection of the two reference hospital sets corresponding to two target hospitals, wherein each hospital in the intersection corresponds to two abnormal medication vector sequences; calculating the intersection-over-union ratio of the two reference hospital sets and the abnormal feature stability degrees of the two abnormal medication vector sequences; and acquiring the similarity according to the intersection-over-union ratio and the two abnormal feature stability degrees.
Specifically, the intersection region of the reference hospital sets corresponding to any two target hospitals is obtained. Each reference hospital in the intersection region belongs to both reference hospital sets, i.e. corresponds to two target hospitals, and therefore has two abnormal medication vector sequences, for each of which the abnormal feature stability degree is obtained. The similarity between the two reference hospital sets is:
Figure BDA0003285218130000071
wherein S ismRepresenting a reference hospital set corresponding to the target hospital m; spRepresenting a reference hospital set corresponding to the target hospital p; sim (S)m、Sp) Reference hospital set SmAnd reference Hospital Collection SpThe similarity between them; s1 denotes a reference hospital set SmAnd reference Hospital Collection SpThe intersection region between; iou denotes the reference hospital set SmAnd reference hospitalCollection SpCross-over ratio of (a); dm(x) Reference hospital set SmThe abnormal characteristic stability degree of the middle reference hospital; dp(x) Reference hospital set SpThe abnormal characteristic stability degree of the middle reference hospital; dm(a) Reference hospital set SmOptionally referring to the degree of stability of abnormal features of the hospital a; dp(b) Reference hospital set SpOptionally referring to the degree of stability of abnormal features of hospital b; i SmI denotes the reference Hospital Collection SmThe number of all reference hospitals in the hospital; i SpI denotes the reference Hospital Collection SpNumber of all reference hospitals in (c).
In addition, $d_m(x) + d_p(x)$ represents the sum of the abnormal feature stability degrees of an element x in the intersection region. Because the embodiment of the invention is concerned with whether the abnormal feature stability degrees of all elements in the intersection region are sufficiently small, $\exp(-(d_m(x) + d_p(x)))$ is used to perform a negative correlation mapping. If $\sum_{x \in S1} \exp(-(d_m(x) + d_p(x)))$ is sufficiently large, the hospital medication in the intersection region is normal, and the size of the intersection ratio iou must then also be considered: if the iou of $S_m$ and $S_p$ is sufficiently large, the similarity of the two reference hospital sets is higher;
Figure BDA0003285218130000081
indicates whether the abnormal feature stability degrees of all elements in the reference hospital set $S_m$ are sufficiently small;
Figure BDA0003285218130000082
indicates whether the abnormal feature stability degrees of all elements in the reference hospital set $S_p$ are sufficiently small. When the abnormal feature stability degrees of the reference hospitals in the intersection of $S_m$ and $S_p$ are small and the intersection is large, the similarity between $S_m$ and $S_p$ is high.
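The two ingredients named above, the intersection ratio and the negative correlation mapping summed over the intersection region, can be sketched as follows. This is only an illustration: the similarity formula itself is given in a figure that is not reproduced in this text, so the final product `iou * stability_term` is an assumption, as are the function name `similarity_sketch` and the dictionary-based inputs.

```python
import math

def similarity_sketch(ref_m, ref_p):
    """ref_m / ref_p map a reference-hospital id to its abnormal feature
    stability degree as seen from target hospital m / p (hypothetical layout)."""
    s1 = set(ref_m) & set(ref_p)        # intersection region S1
    union = set(ref_m) | set(ref_p)
    iou = len(s1) / len(union) if union else 0.0
    # Negative correlation mapping, summed over the intersection region.
    stability_term = sum(math.exp(-(ref_m[x] + ref_p[x])) for x in s1)
    # ASSUMPTION: the two factors are simply multiplied; the patent's figure
    # may combine or normalise them differently.
    return iou, stability_term, iou * stability_term

ref_m = {"h1": 0.0, "h2": 0.0, "h3": 0.5}
ref_p = {"h2": 0.0, "h3": 0.5, "h4": 1.0}
iou, stability_term, sim = similarity_sketch(ref_m, ref_p)
print(iou, stability_term, sim)
```

In this toy input the two sets share two of four hospitals, so the intersection ratio is 0.5, and the hospital with zero stability degree on both sides contributes the largest term to the sum.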
For hospital k, which corresponds to multiple target hospitals and reference hospital sets, all reference hospital sets are divided into multiple similar categories.
In the embodiment of the invention, the reference hospital sets are divided by spectral clustering. Specifically, each reference hospital set is taken as a node, the nodes are connected to one another, and the similarity between two nodes is taken as the edge weight. An edge weight threshold is set, and an edge is removed when its weight is smaller than the threshold, which yields the graph data. Spectral clustering is then performed on the graph data to obtain a plurality of similar categories; each similar category is a set of nodes, and nodes belonging to the same similar category are strongly similar to one another.
Preferably, since the value of the similarity is directly related to the number of hospitals in the intersection region, the edge weight threshold is set to three fifths of the number of hospitals in the intersection region in the embodiment of the present invention.
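The graph construction described above can be sketched as follows. Two simplifications are assumed and should be read as such: a single fixed threshold replaces the per-pair threshold of three fifths of the intersection size, and connected components of the thresholded graph stand in for spectral clustering, which the embodiment actually uses. The node names and helper functions are hypothetical.

```python
def threshold_graph(similarity, threshold):
    """similarity maps a (u, v) node pair to its edge weight.
    Edges whose weight is below the threshold are dropped."""
    adjacency = {}
    for (u, v), weight in similarity.items():
        adjacency.setdefault(u, set())
        adjacency.setdefault(v, set())
        if weight >= threshold:
            adjacency[u].add(v)
            adjacency[v].add(u)
    return adjacency

def connected_components(adjacency):
    """Stand-in for spectral clustering: group nodes that remain connected."""
    seen, groups = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            group.add(node)
            stack.extend(adjacency[node] - seen)
        groups.append(group)
    return groups

# Each node is one reference hospital set; weights are pairwise similarities.
similarity = {("Sa", "Sb"): 0.9, ("Sb", "Sc"): 0.1, ("Sc", "Sd"): 0.8}
categories = connected_components(threshold_graph(similarity, threshold=0.5))
print(categories)
```

A real spectral clustering step would instead work on the eigenvectors of the graph Laplacian of the same thresholded graph, which can split a connected component further when it contains weakly linked groups.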
For any similar category C1, assume that the elements of the category are $C1 = \{c_1, c_2, \ldots, c_i, \ldots\}$. The ratio of the abnormal feature stability degree of any reference hospital in the similar category to the sum of the abnormal feature stability degrees of all reference hospitals in the similar category is obtained, and this ratio is taken as the first weight of the abnormal medication vector corresponding to that reference hospital in order to obtain a first fusion vector.
The first weight is specifically:
$\alpha_i = \dfrac{d_{C1}(c_i)}{\sum_{j} d_{C1}(c_j)}$
wherein $\alpha_i$ represents the first weight; $d_{C1}(c_i)$ represents the abnormal feature stability degree of the i-th reference hospital in the similar category C1.
Further, the first fusion vector specifically includes:
$z(C1) = \sum_{i} \alpha_i \, c_i$
wherein $z(C1)$ represents the first fused vector of the similar category C1; $\alpha_i$ represents the first weight; $c_i$ represents the abnormal medication vector of the i-th reference hospital in the similar category C1.
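A minimal sketch of the first weight and the first fusion vector follows. It assumes the fusion is a weighted sum of the abnormal medication vectors, which is the natural reading of the definitions above but is not confirmed by the figure; the function name `first_fusion` and the error raised for an all-zero category are illustrative choices.

```python
def first_fusion(stabilities, vectors):
    """stabilities[i] is the abnormal feature stability degree of the i-th
    reference hospital in a similar category; vectors[i] is its abnormal
    medication vector (all vectors have the same length)."""
    total = sum(stabilities)
    if total == 0:
        # ASSUMPTION: the source does not say what happens in this case.
        raise ValueError("stability degrees sum to zero; weights undefined")
    # First weight: each hospital's share of the category's total stability.
    weights = [d / total for d in stabilities]
    dim = len(vectors[0])
    # Assumed fusion rule: weighted sum of the abnormal medication vectors.
    fused = [sum(w * v[k] for w, v in zip(weights, vectors)) for k in range(dim)]
    return weights, fused

weights, fused = first_fusion([1.0, 3.0], [[1.0, 0.0], [0.0, 1.0]])
print(weights, fused)
```

Because the weights sum to one, the fused vector stays within the range spanned by the individual abnormal medication vectors of the category.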
Step S400, acquiring the total variation of the abnormal feature stability degree and the mean value of the abnormal feature stability degree in the reference hospital set, and acquiring a second fusion vector according to the first fusion vector, the total variation and the mean value of the reference hospital set.
Specifically, the reference hospital sets in each similar category are taken as nodes to form graph data, and the value of each node in the graph data is the abnormal feature stability degree of the abnormal medication vector sequence corresponding to that node.
The total variation L(C1) of the graph data and the mean value D(C1) of the node values in the graph data are acquired; a smaller total variation means that the graph data is smoother and changes less. The ratio of the mean value to the total variation is acquired and taken as the second weight of the first fusion vector to obtain the second fusion vector.
Specifically, the second weight is:
Figure BDA0003285218130000091
wherein $\beta(C1)$ represents the second weight; $D(C1)$ represents the mean value of the similar category C1; $L(C1)$ represents the total variation of the similar category C1; $C$ represents the set of similar categories.
The second fused vector is then:
$Z = \sum_{C1 \in C} \beta(C1)\, z(C1)$
wherein $Z$ represents the second fused vector; $\beta(C1)$ represents the second weight; $z(C1)$ represents the first fused vector; $C$ represents the set of similar categories.
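Step S400 can be sketched as follows under several stated assumptions. The source does not define the total variation, so the common graph-signal form (sum of squared differences of node values across edges) is assumed; the second weight is taken as the plain ratio of mean to total variation described in the text, although the figure may additionally normalise over the set C; and the small constant `eps` is an illustrative guard against division by zero that is not part of the source.

```python
def total_variation(values, edges):
    # ASSUMPTION: sum of squared node-value differences across edges.
    return sum((values[i] - values[j]) ** 2 for i, j in edges)

def second_weight(values, edges, eps=1e-12):
    """Ratio of the mean node value D(C1) to the total variation L(C1)."""
    mean = sum(values) / len(values)
    return mean / (total_variation(values, edges) + eps)

def second_fusion(categories):
    """categories is a list of (node_values, edges, first_fused_vector),
    one entry per similar category. Returns Z = sum of beta(C1) * z(C1)."""
    dim = len(categories[0][2])
    fused = [0.0] * dim
    for values, edges, z in categories:
        beta = second_weight(values, edges)
        for k in range(dim):
            fused[k] += beta * z[k]
    return fused

categories = [
    ([1.0, 3.0], [(0, 1)], [2.0, 0.0]),               # mean 2, variation 4
    ([2.0, 2.0, 4.0], [(0, 1), (1, 2)], [0.0, 4.0]),  # mean 8/3, variation 4
]
Z = second_fusion(categories)
print(Z)
```

Under these assumptions a category whose node values are large on average but vary little across edges receives a large second weight.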
Step S500, acquiring the abnormal degree of the hospital according to the second fusion vector, and acquiring the acceptable degree of the hospital according to the abnormal degree and the workload; the workload is the ratio of the number of blocks traversed by the hospital to the total number of blocks on the block chain; the hospital to be rewarded is determined according to the acceptable degree.
Specifically, the second fusion vector of hospital k is obtained according to step S400, and its length is taken as the abnormal medication degree of hospital k. A larger abnormal medication degree indicates that unreasonable use or abuse of drugs at hospital k is more serious. The abnormal medication degrees of all disease diagnosis results corresponding to hospital k are added to obtain the abnormal degree of hospital k.
And acquiring the product of the abnormal degree and the workload of the hospital, and taking the product as the acceptable degree.
Specifically, the hospitals are sorted in ascending order of abnormal degree, and the hospitals corresponding to the first K abnormal degrees are selected. The workload of each hospital is defined as the ratio of the number of blocks it traverses when acting as a target hospital to the total number of blocks on the block chain. The abnormal degree of each selected hospital is multiplied by its workload, and the resulting product is taken as the acceptable degree of that hospital.
It should be noted that the acceptable levels of the rest of hospitals except the hospitals corresponding to the first K abnormal levels are 0.
The hospital with the maximum acceptable degree is acquired, the medical data generated by that hospital is packaged into a block, the block is connected to the block chain, and the hospital is given a reward. The reward takes the form of a discount on drug prices, with a discount factor of 0.8. Hospitals with a small acceptable degree therefore pay higher drug prices than hospitals with a large acceptable degree, which improves the drug price competitiveness of hospitals with a large acceptable degree and gradually eliminates participants with a small acceptable degree, that is, hospitals that abuse drugs or price drugs unreasonably.
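The scoring in step S500 can be sketched as follows. The sketch assumes that the "length" of a second fusion vector is its Euclidean norm, and the data layout (one second fusion vector per diagnosis result, plus a count of traversed blocks per hospital) and all names are hypothetical.

```python
import math

def abnormal_degree(second_fusion_vectors):
    """Sum, over diagnosis results, of the length of each second fusion
    vector. ASSUMPTION: length means the Euclidean norm."""
    return sum(math.sqrt(sum(c * c for c in v)) for v in second_fusion_vectors)

def acceptable_degrees(hospitals, total_blocks, top_k):
    """hospitals maps an id to (second fusion vectors, blocks traversed)."""
    degrees = {h: abnormal_degree(vs) for h, (vs, _) in hospitals.items()}
    ranked = sorted(degrees, key=degrees.get)   # ascending abnormal degree
    selected = set(ranked[:top_k])              # first K abnormal degrees
    result = {}
    for h, (_, traversed) in hospitals.items():
        workload = traversed / total_blocks
        # Hospitals outside the first K get an acceptable degree of 0.
        result[h] = degrees[h] * workload if h in selected else 0.0
    return result

hospitals = {
    "A": ([[3.0, 4.0]], 50),
    "B": ([[0.0, 1.0]], 100),
    "C": ([[6.0, 8.0]], 100),
}
acceptable = acceptable_degrees(hospitals, total_blocks=100, top_k=2)
rewarded = max(acceptable, key=acceptable.get)
print(acceptable, rewarded)
```

Here hospital C has the largest abnormal degree and falls outside the first K, so its acceptable degree is 0 regardless of its workload.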
In summary, in the embodiment of the present invention, the drug use feature vector of each hospital is obtained and the hospitals are divided into a normal hospital set and an abnormal hospital set; the abnormal feature stability degree of each abnormal hospital in the abnormal hospital set is obtained, and the abnormal hospitals are divided into a reference hospital set and an attention hospital set according to the abnormal feature stability degree; the reference hospital set is further divided into a plurality of similar categories according to the similarity between the abnormal medication vector sequences of the reference hospitals; a first fusion vector is obtained for each similar category; a second fusion vector is obtained according to the first fusion vectors and the total variation and mean value of the reference hospital set; the acceptable degree of each hospital is then obtained, and rewards are allocated to hospitals with a large acceptable degree. Analysis and mining of the big data yields more useful information, and fusing that information makes it possible to judge whether a hospital abuses drugs or prices drugs unreasonably, which helps to curb vicious competition in medical care.
Based on the same inventive concept as the method embodiment, the embodiment of the present invention further provides a medical data storage and big data mining system based on the block chain technology, and the system includes: a processor, a memory, and a computer program stored in the memory and executable on the processor. The processor, when executing the computer program, implements the steps of one embodiment of the method for storing and mining big data based on the block chain technology, such as the steps shown in fig. 1. The method for storing medical data and mining big data based on the block chain technology has been described in detail in the above embodiments, and is not described again.
It should be noted that the order of the above embodiments of the present invention is for description only and does not indicate the relative merits of the embodiments. Specific embodiments of this specification have been described above. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (9)

1. A medical data storage and big data mining method based on a block chain technology is characterized by comprising the following steps:
traversing medical data of each block on the block chain by the target hospital to obtain a medicine use characteristic vector, wherein the block is a block for storing the medical data by the hospital; the medicine use characteristic vector is the appearance proportion of the medical data; dividing traversed hospitals into a normal hospital set and an abnormal hospital set according to the medicine use characteristic vector;
acquiring the mean value of the medicine use characteristic vectors in the normal hospital set, acquiring the abnormal medication vector of each abnormal hospital in the abnormal hospital set according to the mean value, acquiring the abnormal characteristic stability degree according to the abnormal medication vector sequence, and classifying the hospitals into an attention hospital set and a reference hospital set according to the abnormal characteristic stability degree;
acquiring the similarity between abnormal medication vector sequences of reference hospitals corresponding to any two target hospitals, and dividing the reference hospital set into a plurality of similar categories according to the similarity; acquiring a first fusion vector according to the abnormal feature stability degree and the abnormal medication vector of the reference hospital in each similar category;
acquiring the total variation of the abnormal feature stability degree and the mean value of the abnormal feature stability degree in the reference hospital set, and acquiring a second fusion vector according to the first fusion vector, the total variation and the mean value of the reference hospital set;
acquiring the abnormal degree of the hospital according to the second fusion vector, and acquiring the acceptable degree of the hospital according to the abnormal degree and the workload; the workload is the ratio of the number of the blocks traversed by the target hospital to the total number of blocks on a block chain; the corresponding hospital to which the reward is given is determined according to the acceptable degree of each hospital.
2. The method of claim 1, wherein the step of obtaining a normal hospital set and an abnormal hospital set according to the feature vector comprises:
clustering the feature vectors of all traversed hospitals, wherein the hospitals whose feature vectors are densely distributed form the normal hospital set, and the remaining hospitals form the abnormal hospital set.
3. The method of claim 1, wherein the step of obtaining a sequence of abnormal medication vector for each hospital in the abnormal hospital set further comprises:
and acquiring the mean value of all the feature vectors in the normal hospital set, calculating the difference value between each feature vector in the abnormal hospital set and the mean value, and taking the absolute value of the difference value to obtain the abnormal medication vector corresponding to the abnormal hospital.
4. The method of claim 1, wherein said step of obtaining the degree of stability of the abnormal features from the sequence of abnormal medication vectors comprises:
and obtaining the modulus of each abnormal medication vector in the abnormal medication vector sequence to obtain an abnormal medication degree sequence, and obtaining the stability degree of the abnormal characteristic according to the proportion of non-zero elements in the abnormal medication degree sequence.
5. The method of claim 1, wherein the step of obtaining the similarity between the reference hospital sets corresponding to any two hospitals comprises:
acquiring an intersection between two reference hospital sets corresponding to two target hospitals, wherein each hospital in the intersection corresponds to two abnormal medication vector sequences, calculating an intersection ratio between the two reference hospital sets and an abnormal characteristic stability degree of the two abnormal medication vector sequences, and acquiring the similarity according to the intersection ratio and the two abnormal characteristic stability degrees.
6. The method of claim 1, wherein the step of obtaining a first fused vector based on the abnormal feature stability and abnormal medication vector of the reference hospital in each similar category comprises:
and acquiring a ratio of the abnormal feature stability degree of any reference hospital in the similar category to the sum of the abnormal feature stability degrees of all reference hospitals in the similar category, and acquiring the first fusion vector by taking the ratio as a first weight of the abnormal medication vector corresponding to the reference hospital.
7. The method of claim 1, wherein the step of obtaining a second fused vector based on the first fused vector, the total variation and the mean value comprises:
and acquiring a ratio of the mean value to the total variation, and acquiring a second fusion vector by taking the ratio as a second weight of the first fusion vector.
8. The method of claim 1, wherein said step of obtaining said acceptable level for said hospital based on said degree of abnormal medication and workload comprises:
and acquiring the product of the abnormal medication degree of the hospital and the workload, wherein the product is the acceptable degree.
9. A medical data storage and big data mining system based on block chain technology, comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor when executing the computer program implements the steps of the method according to any one of claims 1 to 8.
CN202111144685.3A 2021-09-28 2021-09-28 Medical data storage and big data mining method and system based on block chain technology Active CN113822365B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111144685.3A CN113822365B (en) 2021-09-28 2021-09-28 Medical data storage and big data mining method and system based on block chain technology

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111144685.3A CN113822365B (en) 2021-09-28 2021-09-28 Medical data storage and big data mining method and system based on block chain technology

Publications (2)

Publication Number Publication Date
CN113822365A true CN113822365A (en) 2021-12-21
CN113822365B CN113822365B (en) 2023-09-05

Family

ID=78921618

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111144685.3A Active CN113822365B (en) 2021-09-28 2021-09-28 Medical data storage and big data mining method and system based on block chain technology

Country Status (1)

Country Link
CN (1) CN113822365B (en)

Citations (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080215402A1 (en) * 2005-09-22 2008-09-04 Pearson Ronald K Methods and Systems for Evaluating Interaction of Medical Products and Dependence on Demographic Variables
US20110295621A1 (en) * 2001-11-02 2011-12-01 Siemens Medical Solutions Usa, Inc. Healthcare Information Technology System for Predicting and Preventing Adverse Events
US20120166212A1 (en) * 2010-10-26 2012-06-28 Campbell Stanley Victor System and method for machine based medical diagnostic code identification, accumulation, analysis and automatic claim process adjudication
US20130166572A1 (en) * 2010-06-28 2013-06-27 Nec Corporation Device, method, and program for extracting abnormal event from medical information
US20150286783A1 (en) * 2014-04-02 2015-10-08 Palo Alto Research Center Incorporated Peer group discovery for anomaly detection
US20150356252A1 (en) * 2013-01-16 2015-12-10 Medaware Ltd. Medical database and system
US9401021B1 (en) * 2011-12-14 2016-07-26 Atti International Services Company, Inc. Method and system for identifying anomalies in medical images especially those including body parts having symmetrical properties
US9779504B1 (en) * 2011-12-14 2017-10-03 Atti International Services Company, Inc. Method and system for identifying anomalies in medical images especially those including one of a pair of symmetric body parts
WO2018058545A1 (en) * 2016-09-30 2018-04-05 曹庆恒 Service unit data feature-based prescription control data standard management system
CN108806780A (en) * 2018-06-14 2018-11-13 四川久远银海软件股份有限公司 A kind of exception medical expense judgment method and device
CN109119137A (en) * 2018-08-24 2019-01-01 腾讯科技(深圳)有限公司 A kind of method for detecting abnormality, device, server and storage medium
CN109473177A (en) * 2018-10-31 2019-03-15 平安科技(深圳)有限公司 The method and Related product of medical development trend are determined based on prediction model
CN109545317A (en) * 2018-10-30 2019-03-29 平安科技(深圳)有限公司 The method and Related product of behavior in hospital are determined based on prediction model in hospital
CN109636613A (en) * 2018-10-19 2019-04-16 平安医疗健康管理股份有限公司 Medical data abnormality recognition method, device, terminal and storage medium
CN109635044A (en) * 2018-12-13 2019-04-16 平安医疗健康管理股份有限公司 Hospitalization data method for detecting abnormality, device, equipment and readable storage medium storing program for executing
CN110648734A (en) * 2018-06-27 2020-01-03 清华大学 Method and device for identifying abnormal cases in medical treatment based on mean value
CN111785384A (en) * 2020-06-29 2020-10-16 平安医疗健康管理股份有限公司 Abnormal data identification method based on artificial intelligence and related equipment
CN111986770A (en) * 2020-08-31 2020-11-24 平安医疗健康管理股份有限公司 Prescription medication auditing method, device, equipment and storage medium
CN111986037A (en) * 2020-08-31 2020-11-24 平安医疗健康管理股份有限公司 Method, device and equipment for monitoring medical insurance audit data and storage medium
US20210257066A1 (en) * 2019-03-07 2021-08-19 Ping An Technology (Shenzhen) Co., Ltd. Machine learning based medical data classification method, computer device, and non-transitory computer-readable storage medium

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
赵志南;: "基于数据挖掘在医疗中的应用分析探讨", 信息通信, no. 09 *
陈晓凤;: "基于层次分析法的用药合理性分析", 科技情报开发与经济, no. 31 *
魏志杰;金涛;王建民;: "基于临床数据挖掘的医疗过程异常发现方法及应用", 计算机集成制造系统, no. 07 *
龚卫宁;: "数据挖掘在医院管理中的应用", 中国医药指南, no. 12 *

Also Published As

Publication number Publication date
CN113822365B (en) 2023-09-05

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20230602

Address after: Room 712, Building A1-1, A-1 District, Dong'an Kaiyun Fuli, Chaoyang District, Changchun City, Jilin Province, 130000

Applicant after: Jilin Deyuan Medical Technology Co.,Ltd.

Address before: 210024 No. 300, Guangzhou Road, Nanjing, Jiangsu

Applicant before: Liu Yupeng

TA01 Transfer of patent application right

Effective date of registration: 20230808

Address after: Room 4003, 4004, and 4005, 4th floor, Building 2, Yard 11, Xinhua East Street, Tongzhou District, Beijing, 101199

Applicant after: Beijing Hengsheng Yuntai Network Technology Co.,Ltd.

Address before: Room 712, Building A1-1, A-1 District, Dong'an Kaiyun Fuli, Chaoyang District, Changchun City, Jilin Province, 130000

Applicant before: Jilin Deyuan Medical Technology Co.,Ltd.

GR01 Patent grant