CN109087711A - Medical big data mining method and system - Google Patents
- Publication number
- CN109087711A (application number CN201810684758.XA)
- Authority
- CN
- China
- Prior art keywords
- data
- medical
- patient
- module
- mining
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/70—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
Landscapes
- Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Public Health (AREA)
- Biomedical Technology (AREA)
- Databases & Information Systems (AREA)
- Pathology (AREA)
- Epidemiology (AREA)
- General Health & Medical Sciences (AREA)
- Primary Health Care (AREA)
- Measuring And Recording Apparatus For Diagnosis (AREA)
Abstract
The invention belongs to the technical field of medical big data, and in particular relates to a medical big data mining method and system. The technical problem to be solved is to provide a medical big data mining method and system that effectively reduce the subjective influence of interestingness-measure selection and reduce the false detection rate/error rate. The technical solution adopted is a medical big data mining method comprising: acquiring the medical data of patients, wherein the medical data of a patient include behavioral data, clinical data, cost data and insurance data; converting the medical data of each patient into structured data; establishing a patient-centered relational database; cleaning the database by filling in or filtering out missing values; performing calculations on the cleaned data based on different kinds of interestingness measures to obtain different rules of interest; and clustering the different interestingness measures with the fuzzy C-means clustering algorithm to obtain the optimized membership degree of each interestingness measure.
Description
Technical field
The invention belongs to the technical field of medical big data, and in particular relates to a medical big data mining method and system.
Background art
This is the era of big data, and applying big data to the medical field has become a research hotspot. Medical big data is of great value: mining the valuable information in it is of great significance for disease diagnosis, the determination of treatment plans, epidemic prediction, medical research, the analysis of drug side effects and so on. In a sense, medical big data systems play an important role in improving the human living environment, improving the quality of life and achieving a higher happiness index.
To apply big data well in the medical field, the accurate application of association mining methods to medical big data is particularly important. An unsuitable association mining method may yield erroneous associations between diseases, between diseases and symptoms, between symptom indicators, and among other relationships, so that the final research results are biased.
However, most existing association rule mining methods for medical big data are limited to a single interestingness measure, and most research focuses on the properties and behavior of individual measures. Different interestingness measures perform differently in different application scenarios, which limits their capability in medical big data association rule mining. At the same time, to make the medical big data obtained more valuable, as many medically relevant data sources as possible need to be integrated, and a traditional single interestingness measure cannot meet the resulting association rule mining requirements well.
Summary of the invention
The present invention overcomes the shortcomings of the prior art. The technical problem to be solved is to provide a medical big data mining method and system that effectively reduce the subjective influence of interestingness-measure selection and reduce the false detection rate/error rate.
To solve the above technical problem, the technical solution adopted by the present invention is as follows:
A medical big data mining method comprises the following steps: acquiring the medical data of patients, wherein the medical data of a patient include behavioral data, clinical data, cost data and insurance data; converting the medical data of each patient into structured data; establishing a patient-centered relational database; cleaning the database by filling in or filtering out missing values; performing calculations on the cleaned data based on different kinds of interestingness measures to obtain different rules of interest; and clustering the different interestingness measures with the fuzzy C-means clustering algorithm to obtain the optimized membership degree of each interestingness measure.
Preferably, converting the medical data of each patient into structured data specifically comprises: dividing the medical data of each patient into structured data and unstructured data; and converting the unstructured data into structured data.
Preferably, cleaning the database by filling in or filtering out missing values specifically comprises: filling in missing values using a linear interpolation algorithm or, according to the distribution characteristics of the data, with one of the mode, median, mean, maximum and minimum; and directly filtering out data that are severely incomplete.
Correspondingly, a medical big data mining system comprises: an acquisition module for acquiring the medical data of patients, wherein the medical data of a patient include behavioral data, clinical data, cost data and insurance data; a data conversion module for converting the medical data of each patient into structured data; an establishing module for establishing a patient-centered relational database; a cleaning module for cleaning the database by filling in or filtering out missing values; an extraction module for performing calculations on the cleaned data based on different kinds of interestingness measures to obtain different rules of interest; and an integration module for clustering the different interestingness measures with the fuzzy C-means clustering algorithm to obtain the optimized membership degree of each interestingness measure.
Preferably, the data conversion module is configured to: divide the medical data of each patient into structured data and unstructured data; and convert the unstructured data into structured data.
Preferably, the cleaning module specifically comprises: a filling module for filling in missing values using a linear interpolation algorithm or, according to the distribution characteristics of the data, with one of the mode, median, mean, maximum and minimum; and a filtering module for directly filtering out data that are severely incomplete.
Compared with the prior art, the invention has the following beneficial effects:
Based on patient medical data, the present invention performs calculations with different kinds of interestingness measures and uses the fuzzy C-means clustering algorithm to obtain the optimized membership degree of each interestingness measure, from which the contribution ranking of each measure is calculated. The whole method is data-driven: it synthesizes or selects the most suitable interestingness measure for a specific medical data mining task, and can thus effectively reduce the subjective influence of interestingness-measure selection and reduce the false detection rate/error rate.
Brief description of the drawings
The present invention is described in further detail below with reference to the accompanying drawings:
Fig. 1 is a flow diagram of the medical big data mining method provided by Embodiment 1 of the present invention;
Fig. 2 is a structural schematic diagram of the medical big data mining system provided by Embodiment 1 of the present invention;
Fig. 3 is a structural schematic diagram of the medical big data mining system provided by Embodiment 2 of the present invention;
Fig. 4 is a schematic diagram of the storage mode of the establishing module provided by Embodiment 2 of the present invention;
In the figures: 10 is the acquisition module, 20 is the data conversion module, 30 is the establishing module, 40 is the cleaning module, 50 is the extraction module, 60 is the integration module, 401 is the filling module, and 402 is the filtering module.
Detailed description of the embodiments
To make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are obviously only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort shall fall within the protection scope of the present invention.
Fig. 1 is a flow diagram of the medical big data mining method provided by Embodiment 1 of the present invention. As shown in Fig. 1, the medical big data mining method comprises the following steps: acquiring the medical data of patients, wherein the medical data of a patient include behavioral data, clinical data, cost data and insurance data; converting the medical data of each patient into structured data; establishing a patient-centered relational database; cleaning the database by filling in or filtering out missing values; performing calculations on the cleaned data based on different kinds of interestingness measures to obtain different rules of interest; and clustering the different interestingness measures with the fuzzy C-means clustering algorithm to obtain the optimized membership degree of each interestingness measure.
In Embodiment 1, a large volume of patient-related information can be acquired using different types of medical equipment or systems, for example B-mode ultrasound, CT, magnetic resonance imaging, electrocardiography, electroencephalography, portable wearable devices and hospital information systems. For example, a patient's basic information is obtained through registration and questionnaires; the patient's visit and medication information is obtained by reading in medical records; and the patient's cost and insurance information is obtained by connecting to the hospital information management system. Together these form the different patient-centered blocks of information.
Specifically, converting the medical data of each patient into structured data comprises: dividing the medical data of each patient into structured data and unstructured data; and converting the unstructured data into structured data. In Embodiment 1, the acquired patient medical data include structured data (such as the patient's basic information and various clinical examination indicators) and unstructured data (such as medical record files (e.g. XML), clinical medical images and examination reports in various text formats). A specific information extraction method needs to be selected for each type of unstructured data in order to convert it into structured data. For example, the images and videos in the patient-related data are structured by algorithms based on deep learning, and textual data are processed and converted using natural language processing techniques.
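As a minimal sketch of one such conversion, the snippet below flattens an XML medical record file into a single structured row. It covers only the XML case named above; the deep-learning and natural-language-processing conversions are not shown. The tag names (`visit_code`, `diagnosis`, `medication`) and the function name `xml_record_to_row` are hypothetical and do not come from the patent.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML record; the tag names are illustrative assumptions only.
RECORD_XML = """
<record>
  <visit_code>V0001</visit_code>
  <diagnosis>hypertension</diagnosis>
  <medication>amlodipine</medication>
</record>
"""

def xml_record_to_row(xml_text):
    """Flatten the direct children of the root element into one structured row (a dict)."""
    root = ET.fromstring(xml_text.strip())
    return {child.tag: (child.text or "").strip() for child in root}

row = xml_record_to_row(RECORD_XML)
print(row)
```

A row of this shape can then be inserted into a table of the patient-centered database.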
Further, cleaning the database by filling in or filtering out missing values specifically comprises: filling in missing values using a linear interpolation algorithm or, according to the distribution characteristics of the data, with one of the mode, median, mean, maximum and minimum; and directly filtering out data that are severely incomplete.
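The cleaning step can be sketched as follows, under stated assumptions: a column is a Python list in which `None` marks a missing value, and "severely incomplete" is interpreted as a missing ratio above a threshold. The threshold `max_missing_ratio` and all function names are hypothetical; the patent does not specify them.

```python
from statistics import mean, median, mode

def linear_interpolate(values):
    """Fill None gaps that lie between two known values by linear interpolation."""
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for k in range(left + 1, right):
            filled[k] = filled[left] + step * (k - left)
    return filled

def fill_with_statistic(values, how="median"):
    """Fill None entries with one statistic of the observed values."""
    observed = [v for v in values if v is not None]
    stat_fn = {"mode": mode, "median": median, "mean": mean, "max": max, "min": min}[how]
    stat = stat_fn(observed)
    return [stat if v is None else v for v in values]

def clean_column(values, max_missing_ratio=0.5, how="median"):
    """Return the filled column, or None when the column is filtered out.

    max_missing_ratio is an assumed cut-off for "severely incomplete".
    """
    missing = sum(v is None for v in values)
    if not values or missing / len(values) > max_missing_ratio:
        return None
    return fill_with_statistic(values, how)
```

For example, `linear_interpolate([1.0, None, None, 4.0])` fills the two gaps on the straight line between 1.0 and 4.0, while `clean_column([None, None, None, 1])` drops the column because three of its four values are missing.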
In the present invention, a patient-centered relational database is established in which the patient visit code from the hospital management system serves as the primary key identifying a patient; the primary key uniquely identifies each patient record in a database table.
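A minimal sketch of such a patient-centered schema, using Python's built-in `sqlite3` module, is given below. The table and column names are illustrative assumptions; the patent only states that the visit code acts as the primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE patient (
    visit_code TEXT PRIMARY KEY,   -- visit code from the hospital system
    name       TEXT,
    age        INTEGER
);
CREATE TABLE clinical (
    id         INTEGER PRIMARY KEY,
    visit_code TEXT NOT NULL REFERENCES patient(visit_code),
    indicator  TEXT,
    result     REAL
);
CREATE TABLE cost (
    id         INTEGER PRIMARY KEY,
    visit_code TEXT NOT NULL REFERENCES patient(visit_code),
    item       TEXT,
    amount     REAL
);
""")

# Hypothetical sample rows, all linked through the same visit code.
conn.execute("INSERT INTO patient VALUES (?, ?, ?)", ("V0001", "patient A", 54))
conn.execute("INSERT INTO clinical (visit_code, indicator, result) VALUES (?, ?, ?)",
             ("V0001", "systolic_bp", 150.0))
conn.execute("INSERT INTO cost (visit_code, item, amount) VALUES (?, ?, ?)",
             ("V0001", "CT scan", 320.0))

# Join the blocks of information around the patient.
rows = conn.execute("""
SELECT p.visit_code, c.indicator, c.result, k.item, k.amount
FROM patient p
JOIN clinical c ON c.visit_code = p.visit_code
JOIN cost k ON k.visit_code = p.visit_code
""").fetchall()
print(rows)
```

Keying every table on the visit code lets behavioral, clinical, cost and insurance records be joined per patient before mining.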
To effectively reduce the subjective influence of interestingness-measure selection and reduce the false detection rate/error rate, association rules in the medical data generated under different medical scenarios cannot be mined with a single interestingness measure alone.
Below, the expression A → B indicates that medical data appearing in set A are highly likely to also appear in set B. Some of the interestingness measures used here are shown in the following table:
Table 1: Some of the interestingness measures used in the present invention
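As a hedged illustration of what an interestingness measure for a rule A → B computes, the sketch below evaluates three widely used measures (support, confidence and lift). These three are chosen as an assumption for illustration and are not necessarily the measures listed in Table 1; the sample records and the function name `rule_measures` are hypothetical.

```python
def rule_measures(transactions, A, B):
    """Support, confidence and lift of the rule A -> B over a list of item sets."""
    n = len(transactions)
    A, B = set(A), set(B)
    n_a = sum(A <= t for t in transactions)         # records containing A
    n_b = sum(B <= t for t in transactions)         # records containing B
    n_ab = sum((A | B) <= t for t in transactions)  # records containing both
    support = n_ab / n
    confidence = n_ab / n_a if n_a else 0.0
    lift = confidence / (n_b / n) if n_b else 0.0
    return {"support": support, "confidence": confidence, "lift": lift}

# Hypothetical patient records, each a set of observed items.
records = [
    {"hypertension", "headache"},
    {"hypertension", "headache", "dizziness"},
    {"hypertension"},
    {"diabetes", "headache"},
]
m = rule_measures(records, {"hypertension"}, {"headache"})
print(m)
```

Different measures can rank the same rule differently, which is why the method compares several of them rather than relying on one.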
After the pairwise distances between interestingness measures are obtained, the relative behavior of the different kinds of interestingness measures is the focus of the present invention. The fuzzy C-means clustering algorithm is selected to cluster the different interestingness measures because the membership degrees it produces can measure not only the differences between measures in different clusters but also the differences between measures within the same cluster.
The objective function of the fuzzy C-means clustering algorithm is as follows:
Here N, c and m are the number of kinds of interestingness measures, the number of clusters and the fuzzy factor, respectively. x_i and v_j denote the i-th interestingness measure and the j-th cluster center, respectively. The fuzzy C-means clustering algorithm essentially minimizes the above objective function Q, which can be done by continuous iteration as follows:
The optimized membership degree of each interestingness measure is thereby obtained.
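The formulas themselves are not reproduced in this text. For reference, the standard textbook form of the fuzzy C-means objective and its iterative updates, written with the symbols N, c, m, x_i, v_j and Q defined above, is given below. The membership symbol u_{ij} (the degree to which the i-th interestingness measure belongs to the j-th cluster) is an assumed notation, and the patent's own formulas may differ in detail.

```latex
% Objective function, minimized subject to \sum_{j=1}^{c} u_{ij} = 1 for every i
Q = \sum_{i=1}^{N} \sum_{j=1}^{c} u_{ij}^{\,m} \, \lVert x_i - v_j \rVert^{2}, \qquad m > 1

% Iterative updates, alternated until the memberships stop changing
u_{ij} = \frac{1}{\displaystyle \sum_{k=1}^{c}
         \left( \frac{\lVert x_i - v_j \rVert}{\lVert x_i - v_k \rVert} \right)^{\frac{2}{m-1}}},
\qquad
v_j = \frac{\displaystyle \sum_{i=1}^{N} u_{ij}^{\,m} \, x_i}{\displaystyle \sum_{i=1}^{N} u_{ij}^{\,m}}
```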
The membership degrees in the present invention reflect the relative behavior of the different interestingness measures during rule association and the differences between them. Based on the membership values, the roles that the different interestingness measures play in medical big data association rule mining are analyzed comprehensively, so that a more suitable interestingness measure can be selected or integrated for each medical problem. This reduces the subjective influence of interestingness-measure selection and reduces the false detection rate/error rate. It also improves the comprehensiveness and accuracy of the data and the efficiency of data processing.
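A minimal pure-Python sketch of fuzzy C-means is shown below to illustrate how such membership degrees can be obtained. It rests on several assumptions not stated in the patent: each interestingness measure is represented as a numeric vector (for example, its values over a common set of rules), distances are Euclidean, and the initial centers are picked deterministically from the input. The function name, parameters and sample data are hypothetical.

```python
import math

def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6):
    """Return (u, centers): u[i][j] is the membership of point i in cluster j."""
    n, dim = len(points), len(points[0])
    # Assumed deterministic start: c input points spread evenly through the list.
    centers = [list(points[(j * (n - 1)) // max(c - 1, 1)]) for j in range(c)]
    u = [[0.0] * c for _ in range(n)]
    for _ in range(iters):
        # Membership update from the current centers.
        new_u = []
        for x in points:
            d = [math.dist(x, v) for v in centers]
            if any(dk == 0.0 for dk in d):
                # Point sits on a center: share full membership among those centers.
                zeros = [dk == 0.0 for dk in d]
                row = [1.0 / sum(zeros) if z else 0.0 for z in zeros]
            else:
                row = [1.0 / sum((d[j] / d[k]) ** (2.0 / (m - 1.0)) for k in range(c))
                       for j in range(c)]
            new_u.append(row)
        # Center update: membership-weighted mean of the points.
        centers = [
            [sum(new_u[i][j] ** m * points[i][t] for i in range(n))
             / sum(new_u[i][j] ** m for i in range(n))
             for t in range(dim)]
            for j in range(c)
        ]
        shift = max(abs(new_u[i][j] - u[i][j]) for i in range(n) for j in range(c))
        u = new_u
        if shift < tol:
            break
    return u, centers

# Four hypothetical measures, each described by a two-dimensional vector.
measures = [[0.10, 0.20], [0.15, 0.25], [0.80, 0.90], [0.85, 0.95]]
u, centers = fuzzy_c_means(measures, c=2)
for row in u:
    print([round(x, 3) for x in row])
```

Each row of `u` sums to 1, so the memberships can be read as how strongly a measure belongs to each cluster. Measures with similar rows behave alike, and the size of the memberships also separates measures inside the same cluster.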
Fig. 2 is a structural schematic diagram of the medical big data mining system provided by Embodiment 1 of the present invention. As shown in Fig. 2, the medical big data mining system comprises:
an acquisition module (10) for acquiring the medical data of patients, wherein the medical data of a patient include behavioral data, clinical data, cost data and insurance data;
a data conversion module (20) for converting the medical data of each patient into structured data;
an establishing module (30) for establishing a patient-centered relational database;
a cleaning module (40) for cleaning the database by filling in or filtering out missing values;
an extraction module (50) for performing calculations on the cleaned data based on different kinds of interestingness measures to obtain different rules of interest;
an integration module (60) for clustering the different interestingness measures with the fuzzy C-means clustering algorithm to obtain the optimized membership degree of each interestingness measure.
Specifically, the data conversion module (20) is configured to:
divide the medical data of each patient into structured data and unstructured data;
convert the unstructured data into structured data.
Fig. 3 is a structural schematic diagram of the medical big data mining system provided by Embodiment 2 of the present invention. As shown in Fig. 3, on the basis of Embodiment 1, the cleaning module (40) specifically comprises: a filling module (401) for filling in missing values using a linear interpolation algorithm or, according to the distribution characteristics of the data, with one of the mode, median, mean, maximum and minimum; and a filtering module (402) for directly filtering out data that are severely incomplete.
Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that the technical solutions described in the foregoing embodiments can still be modified, or some or all of their technical features can be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.
Claims (6)
1. A medical big data mining method, characterized by comprising the following steps:
acquiring the medical data of patients, wherein the medical data of a patient include behavioral data, clinical data, cost data and insurance data;
converting the medical data of each patient into structured data;
establishing a patient-centered relational database;
cleaning the database by filling in or filtering out missing values;
performing calculations on the cleaned data based on different kinds of interestingness measures to obtain different rules of interest;
clustering the different interestingness measures with the fuzzy C-means clustering algorithm to obtain the optimized membership degree of each interestingness measure.
2. The medical big data mining method according to claim 1, characterized in that converting the medical data of each patient into structured data specifically comprises:
dividing the medical data of each patient into structured data and unstructured data;
converting the unstructured data into structured data.
3. The medical big data mining method according to claim 1, characterized in that cleaning the database by filling in or filtering out missing values specifically comprises:
filling in missing values using a linear interpolation algorithm or, according to the distribution characteristics of the data, with one of the mode, median, mean, maximum and minimum;
directly filtering out data that are severely incomplete.
4. A medical big data mining system, characterized by comprising:
an acquisition module (10) for acquiring the medical data of patients, wherein the medical data of a patient include behavioral data, clinical data, cost data and insurance data;
a data conversion module (20) for converting the medical data of each patient into structured data;
an establishing module (30) for establishing a patient-centered relational database;
a cleaning module (40) for cleaning the database by filling in or filtering out missing values;
an extraction module (50) for performing calculations on the cleaned data based on different kinds of interestingness measures to obtain different rules of interest;
an integration module (60) for clustering the different interestingness measures with the fuzzy C-means clustering algorithm to obtain the optimized membership degree of each interestingness measure.
5. The medical big data mining system according to claim 4, characterized in that the data conversion module (20) is configured to:
divide the medical data of each patient into structured data and unstructured data;
convert the unstructured data into structured data.
6. The medical big data mining system according to claim 4, characterized in that the cleaning module (40) specifically comprises:
a filling module (401) for filling in missing values using a linear interpolation algorithm or, according to the distribution characteristics of the data, with one of the mode, median, mean, maximum and minimum;
a filtering module (402) for directly filtering out data that are severely incomplete.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810684758.XA CN109087711A (en) | 2018-06-28 | 2018-06-28 | Medical big data mining method and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810684758.XA CN109087711A (en) | 2018-06-28 | 2018-06-28 | Medical big data mining method and system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109087711A (en) | 2018-12-25 |
Family
ID=64840000
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810684758.XA Pending CN109087711A (en) | 2018-06-28 | 2018-06-28 | Medical big data mining method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109087711A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113436027A (en) * | 2021-06-30 | 2021-09-24 | 山大地纬软件股份有限公司 | Medical insurance reimbursement abnormal data detection method and system |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101520878A (en) * | 2009-04-03 | 2009-09-02 | 华为技术有限公司 | Method, device and system for pushing advertisements to users |
CN103678672A (en) * | 2013-12-25 | 2014-03-26 | 北京中兴通软件科技股份有限公司 | Method for recommending information |
CN104268290A (en) * | 2014-10-22 | 2015-01-07 | 武汉科技大学 | Recommendation method based on user cluster |
CN106055908A (en) * | 2016-06-13 | 2016-10-26 | 武汉理工大学 | Personal medical information recommending method and system based on cloud computation |
CN106202891A (en) * | 2016-06-30 | 2016-12-07 | 电子科技大学 | A kind of big data digging method towards Evaluation of Medical Quality |
Non-Patent Citations (1)
Title |
---|
余辉: "医学知识获取与发现的研究", 《中国博士学位论文全文数据库》 * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | RJ01 | Rejection of invention patent application after publication | Application publication date: 20181225 |