CN109087711A - Medical big data mining method and system - Google Patents

Medical big data mining method and system

Info

Publication number
CN109087711A
Authority
CN
China
Prior art keywords
data
medical
patient
module
mining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810684758.XA
Other languages
Chinese (zh)
Inventor
赵杰
李金博
李砺锋
张腾飞
薛文华
翟运开
宋晓琴
孙东旭
范智蕊
沈志博
朱子家
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
First Affiliated Hospital of Zhengzhou University
Original Assignee
First Affiliated Hospital of Zhengzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by First Affiliated Hospital of Zhengzhou University
Priority to CN201810684758.XA
Publication of CN109087711A
Legal status: Pending

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics, for mining of medical data, e.g. analysing previous cases of other patients

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Public Health (AREA)
  • Biomedical Technology (AREA)
  • Databases & Information Systems (AREA)
  • Pathology (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Measuring And Recording Apparatus For Diagnosis (AREA)

Abstract

The invention belongs to the technical field of medical big data, and in particular relates to a medical big data mining method and system. The technical problem to be solved is to provide a medical big data mining method and system that effectively reduce the subjective influence of interestingness-measure selection and reduce the false-detection/error rate. The technical solution adopted is a medical big data mining method comprising: collecting the medical data of patients, wherein the medical data of a patient includes behavioral data, clinical data, cost data and insurance data; converting the medical data of each patient into structured data; establishing a patient-centered relational database; cleaning the database by filling in or filtering out missing values; computing on the cleaned data with several different interestingness measures to obtain different interesting rules; and clustering the different interestingness measures with the fuzzy C-means clustering algorithm to obtain the optimized membership degree of each interestingness measure.

Description

Medical big data mining method and system
Technical field
The invention belongs to the technical field of medical big data, and in particular relates to a medical big data mining method and system.
Background art
We live in the era of big data, and applying big data to the medical field has become a research hotspot. Medical big data is of great value: mining the valuable information in it is of great significance for disease diagnosis, determination of treatment plans, epidemic prediction, medical research, analysis of drug side effects, and so on. In a sense, medical big data systems play an important role in improving the human living environment, raising quality of life, and achieving a higher happiness index.
To apply big data to the medical field more effectively, the accurate application of association mining methods to medical big data is particularly important. An unsuitable association mining method may produce erroneous associations between diseases, between diseases and symptoms, between symptom indicators, and among other relationships, thereby biasing the final research results.
However, most existing association rule mining methods for medical big data are limited to a single interestingness measure. Most research focuses on the properties and behavior of different measures, but different interestingness measures perform differently in different application scenarios, and this limitation restricts their capability in medical big data association rule mining. At the same time, to make the medical big data obtained more valuable, medical data sources from as many aspects as possible need to be integrated, and a traditional single interestingness measure cannot satisfy the resulting association rule mining requirements well.
Summary of the invention
The present invention overcomes the shortcomings of the prior art. The technical problem to be solved is to provide a medical big data mining method and system that effectively reduce the subjective influence of interestingness-measure selection and reduce the false-detection/error rate.
To solve the above technical problem, the technical solution adopted by the present invention is as follows:
A medical big data mining method, including the following steps: collecting the medical data of patients, wherein the medical data of a patient includes behavioral data, clinical data, cost data and insurance data; converting the medical data of each patient into structured data; establishing a patient-centered relational database; cleaning the database by filling in or filtering out missing values; computing on the cleaned data with several different interestingness measures to obtain different interesting rules; and clustering the different interestingness measures with the fuzzy C-means clustering algorithm to obtain the optimized membership degree of each interestingness measure.
Preferably, converting the medical data of each patient into structured data specifically includes: dividing the medical data of each patient into structured data and unstructured data; and converting the unstructured data into structured data.
Preferably, cleaning the database by filling in or filtering out missing values specifically includes: filling missing values using a linear interpolation algorithm or, according to the distribution characteristics of the data, with one of the mode, median, mean, maximum or minimum; and directly filtering out data whose values are seriously missing.
Correspondingly, a medical big data mining system includes: an acquisition module for collecting the medical data of patients, wherein the medical data of a patient includes behavioral data, clinical data, cost data and insurance data; a data conversion module for converting the medical data of each patient into structured data; an establishing module for establishing a patient-centered relational database; a cleaning module for cleaning the database by filling in or filtering out missing values; an extraction module for computing on the cleaned data with several different interestingness measures to obtain different interesting rules; and an integration module for clustering the different interestingness measures with the fuzzy C-means clustering algorithm to obtain the optimized membership degree of each interestingness measure.
Preferably, the data conversion module is configured to: divide the medical data of each patient into structured data and unstructured data; and convert the unstructured data into structured data.
Preferably, the cleaning module specifically includes: a filling module for filling missing values using a linear interpolation algorithm or, according to the distribution characteristics of the data, with one of the mode, median, mean, maximum or minimum; and a filtering module for directly filtering out data whose values are seriously missing.
Compared with the prior art, the present invention has the following beneficial effects:
Based on patient medical data, the present invention computes several different interestingness measures and uses the fuzzy C-means clustering algorithm to obtain the optimized membership degree of each interestingness measure, from which the contribution and ranking behavior of each measure is calculated. The whole method is data-driven: it synthesizes or selects the most suitable interestingness measure for a specific medical data mining task, and can thereby effectively reduce the subjective influence of interestingness-measure selection and reduce the false-detection/error rate.
Brief description of the drawings
The present invention is described in further detail below with reference to the accompanying drawings;
Fig. 1 is a flow diagram of the medical big data mining method provided by Embodiment 1 of the present invention;
Fig. 2 is a structural diagram of the medical big data mining system provided by Embodiment 1 of the present invention;
Fig. 3 is a structural diagram of the medical big data mining system provided by Embodiment 2 of the present invention;
Fig. 4 is a schematic diagram of the storage scheme of the establishing module provided by Embodiment 2 of the present invention;
In the figures: 10 is the acquisition module, 20 is the data conversion module, 30 is the establishing module, 40 is the cleaning module, 50 is the extraction module, 60 is the integration module, 401 is the filling module, and 402 is the filtering module.
Detailed description of the embodiments
To make the objectives, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently some, but not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort shall fall within the protection scope of the present invention.
Fig. 1 is a flow diagram of the medical big data mining method provided by Embodiment 1 of the present invention. As shown in Fig. 1, the medical big data mining method includes the following steps: collecting the medical data of patients, wherein the medical data of a patient includes behavioral data, clinical data, cost data and insurance data; converting the medical data of each patient into structured data; establishing a patient-centered relational database; cleaning the database by filling in or filtering out missing values; computing on the cleaned data with several different interestingness measures to obtain different interesting rules; and clustering the different interestingness measures with the fuzzy C-means clustering algorithm to obtain the optimized membership degree of each interestingness measure.
In Embodiment 1, the medical data of patients may be collected by using different types of medical equipment or systems, such as B-mode ultrasound, CT, magnetic resonance imaging, electrocardiography, electroencephalography, portable wearable devices and hospital information systems, to gather massive amounts of patient-related information. For example: basic patient information is obtained through registration and questionnaires; visit and medication information is obtained by reading medical records and similar means; and cost and insurance information is obtained by connecting to the hospital information management system. Together these form separate patient-centered information blocks.
Specifically, converting the medical data of each patient into structured data includes: dividing the medical data of each patient into structured data and unstructured data; and converting the unstructured data into structured data. In Embodiment 1, the collected patient medical data includes structured data (such as basic patient information and all kinds of clinical examination indicators) and unstructured data (such as medical record files (e.g. XML), clinical medical images, and examination reports in various text formats). A specific information extraction method needs to be selected for each type of unstructured data in order to convert it into structured data. For example, images and videos in the patient-related data are structured by algorithms based on deep learning, and textual data is processed and converted using natural language processing techniques.
Further, cleaning the database by filling in or filtering out missing values specifically includes: filling missing values using a linear interpolation algorithm or, according to the distribution characteristics of the data, with one of the mode, median, mean, maximum or minimum; and directly filtering out data whose values are seriously missing.
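The cleaning step above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation of the invention: the function names `linear_interpolate` and `clean_column`, the use of `None` to mark a missing value, and the 50% threshold for "seriously missing" are all hypothetical, since the text does not specify them.

```python
from statistics import mean, median, mode

def linear_interpolate(values):
    """Fill gaps (None) that lie between two known values by linear interpolation."""
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for k in range(left + 1, right):
            filled[k] = filled[left] + step * (k - left)
    return filled

# The five statistics named in the text.
FILL_STATS = {"mode": mode, "median": median, "mean": mean, "max": max, "min": min}

def clean_column(values, strategy="median", max_missing_ratio=0.5):
    """Return the cleaned column, or None if it should be filtered out.

    max_missing_ratio is an assumed threshold for "seriously missing".
    """
    present = [v for v in values if v is not None]
    if not present or 1 - len(present) / len(values) > max_missing_ratio:
        return None  # seriously missing: filter out directly
    if strategy == "linear":
        values = linear_interpolate(values)
        strategy = "median"  # assumed fallback for leading/trailing gaps
    fill = FILL_STATS[strategy](present)
    return [fill if v is None else v for v in values]
```

For instance, `clean_column([1.0, None, 3.0], "linear")` fills the gap by interpolation, while a column with three of four values missing is dropped. Which statistic suits a given column depends on its distribution, as the text notes.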
In the present invention, a patient-centered relational database is established using the patient visit code from the hospital management system (this code can serve as the primary key identifying a patient; a primary key uniquely identifies each patient record in a database table).
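A patient-centered relational layout keyed on the visit code might look like the sketch below. The table names, column names and the choice of SQLite are assumptions made purely for illustration; the text does not disclose the actual schema or database engine.

```python
import sqlite3

# Hypothetical schema: every table hangs off the patient's visit code.
SCHEMA = """
CREATE TABLE patient (
    visit_code TEXT PRIMARY KEY,   -- visit code from the hospital system
    gender     TEXT,
    birth_year INTEGER
);
CREATE TABLE clinical (
    visit_code TEXT NOT NULL REFERENCES patient(visit_code),
    indicator  TEXT NOT NULL,
    value      REAL
);
CREATE TABLE cost (
    visit_code TEXT NOT NULL REFERENCES patient(visit_code),
    item       TEXT NOT NULL,
    amount     REAL,
    insured    INTEGER             -- 1 if covered by insurance, else 0
);
"""

def open_patient_db(path=":memory:"):
    """Create the patient-centered tables and return the connection."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn

def patient_record(conn, visit_code):
    """Gather the information blocks that belong to one visit code."""
    clinical = conn.execute(
        "SELECT indicator, value FROM clinical WHERE visit_code = ? ORDER BY indicator",
        (visit_code,),
    ).fetchall()
    cost = conn.execute(
        "SELECT item, amount, insured FROM cost WHERE visit_code = ? ORDER BY item",
        (visit_code,),
    ).fetchall()
    return {"clinical": clinical, "cost": cost}
```

Because `visit_code` is the primary key of `patient`, inserting a second row with the same code raises `sqlite3.IntegrityError`, which is the uniqueness property the paragraph above relies on.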
To effectively reduce the subjective influence of interestingness-measure selection and to reduce the false-detection/error rate, the association rules in the medical data generated under different medical scenarios cannot be mined with a single interestingness measure alone.
Below, the expression A → B denotes that medical data appearing in set A is highly likely to also appear in set B; some of the interestingness measures used here are listed in the following table:
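To make the rule notation A → B concrete, the sketch below scores one rule with several interestingness measures. The table of measures did not survive in this text, so the five measures shown (support, confidence, lift, leverage, Jaccard) are assumed, commonly used examples rather than the list of the invention; the function name `rule_measures` and the sample records are hypothetical.

```python
def rule_measures(transactions, antecedent, consequent):
    """Score the rule antecedent -> consequent over a list of item sets.

    Each transaction is a set of items (for example, the diagnoses of one patient).
    """
    a, b = frozenset(antecedent), frozenset(consequent)
    n = len(transactions)
    n_a = sum(a <= t for t in transactions)          # records containing A
    n_b = sum(b <= t for t in transactions)          # records containing B
    n_ab = sum((a | b) <= t for t in transactions)   # records containing both
    union = n_a + n_b - n_ab
    return {
        "support": n_ab / n,
        "confidence": n_ab / n_a if n_a else 0.0,
        "lift": n_ab * n / (n_a * n_b) if n_a and n_b else 0.0,
        "leverage": n_ab / n - (n_a / n) * (n_b / n),
        "jaccard": n_ab / union if union else 0.0,
    }

# Illustrative, made-up records.
records = [
    {"diabetes", "hypertension"},
    {"diabetes", "hypertension", "obesity"},
    {"diabetes"},
    {"hypertension"},
    {"obesity"},
]
scores = rule_measures(records, {"diabetes"}, {"hypertension"})
```

Different measures can rank the same set of rules differently, which is the motivation the text gives for not relying on a single one.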
Table 1: Some of the interestingness measures used in the present invention
After the pairwise distances between interestingness measures are obtained, the relative behavior of the different kinds of interestingness measures is the focus of the present invention. The fuzzy C-means clustering algorithm is chosen to cluster the different interestingness measures because the membership degrees it produces can measure not only the differences between interestingness measures in different clusters but also the differences between interestingness measures within the same cluster.
The objective function of the fuzzy C-means clustering algorithm is as follows:
Here N, c and m are the number of interestingness measures, the number of clusters and the fuzzy factor, respectively; x_i and v_j denote the i-th interestingness measure and the j-th cluster center, respectively. The fuzzy C-means clustering algorithm essentially minimizes the above objective function Q, which can be done by continuous iteration as follows:
The optimized membership degree of each interestingness measure is thereby obtained.
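For reference, the standard textbook form of the fuzzy C-means objective and its alternating update rules, written with the symbols defined above, is given below. This is the well-known general formulation and is offered as an assumption about what the omitted formulas express; the symbol u_ij for the membership degree of the i-th interestingness measure in the j-th cluster is introduced here and does not appear in the surviving text.

```latex
% Objective (fuzzy factor m > 1), with memberships constrained to sum to 1
Q = \sum_{i=1}^{N} \sum_{j=1}^{c} u_{ij}^{m} \, \lVert x_i - v_j \rVert^{2},
\qquad \sum_{j=1}^{c} u_{ij} = 1 \quad (i = 1, \dots, N)

% Alternating updates, repeated until the memberships stop changing
u_{ij} = \left[ \sum_{k=1}^{c}
  \left( \frac{\lVert x_i - v_j \rVert}{\lVert x_i - v_k \rVert} \right)^{\frac{2}{m-1}}
\right]^{-1},
\qquad
v_j = \frac{\sum_{i=1}^{N} u_{ij}^{m} \, x_i}{\sum_{i=1}^{N} u_{ij}^{m}}
```

Each update lowers or preserves Q, so iterating the two rules converges to a local minimum or saddle point of the objective.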
The membership degrees in the present invention reflect the relative behavior of the different interestingness measures during rule association and the differences between them. Based on the membership values, the roles played by the different interestingness measures in medical big data association rule mining are analyzed comprehensively, so that a more suitable interestingness measure can be selected or integrated for each medical problem. This reduces the subjective influence of interestingness-measure selection, reduces the false-detection/error rate, and improves the comprehensiveness and accuracy of the data and the efficiency of data processing.
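A compact sketch of how membership degrees for a set of interestingness measures could be computed with fuzzy C-means is shown below. It assumes that each measure is represented as a numeric vector (for example, the scores it assigns to the same list of rules); the text does not state this representation, and the function name `fuzzy_c_means`, its parameters and the random initialisation are illustrative choices rather than the method of the invention.

```python
import math
import random

def fuzzy_c_means(points, c, m=2.0, max_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy C-means; returns (memberships, centers).

    points: one numeric vector per interestingness measure (assumed representation).
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([x / total for x in row])
    centers = []
    for _ in range(max_iter):
        # Center update: mean of the points weighted by u_ij ** m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / total for d in range(dim)]
            )
        # Membership update from the distances to every center.
        new_u = []
        for i in range(n):
            dists = [math.dist(points[i], centers[j]) for j in range(c)]
            zeros = [d == 0.0 for d in dists]
            if any(zeros):
                # The point coincides with a center: give it full membership there.
                row = [1.0 / sum(zeros) if z else 0.0 for z in zeros]
            else:
                inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
                total = sum(inv)
                row = [x / total for x in inv]
            new_u.append(row)
        change = max(abs(new_u[i][j] - u[i][j]) for i in range(n) for j in range(c))
        u = new_u
        if change < tol:
            break
    return u, centers
```

Each row of the returned membership matrix sums to 1, so a measure with memberships close to (0.5, 0.5) behaves like both clusters, while one close to (1, 0) is typical of a single cluster. That graded information is what the paragraph above proposes to use when selecting or combining measures.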
Fig. 2 is a structural diagram of the medical big data mining system provided by Embodiment 1 of the present invention. As shown in Fig. 2, the medical big data mining system includes:
An acquisition module (10) for collecting the medical data of patients, wherein the medical data of a patient includes behavioral data, clinical data, cost data and insurance data;
A data conversion module (20) for converting the medical data of each patient into structured data;
An establishing module (30) for establishing a patient-centered relational database;
A cleaning module (40) for cleaning the database by filling in or filtering out missing values;
An extraction module (50) for computing on the cleaned data with several different interestingness measures to obtain different interesting rules;
An integration module (60) for clustering the different interestingness measures with the fuzzy C-means clustering algorithm to obtain the optimized membership degree of each interestingness measure.
Specifically, the data conversion module (20) is configured to:
Divide the medical data of each patient into structured data and unstructured data;
Convert the unstructured data into structured data.
Fig. 3 is a structural diagram of the medical big data mining system provided by Embodiment 2 of the present invention. As shown in Fig. 3, on the basis of Embodiment 1, the cleaning module (40) specifically includes: a filling module (401) for filling missing values using a linear interpolation algorithm or, according to the distribution characteristics of the data, with one of the mode, median, mean, maximum or minimum; and a filtering module (402) for directly filtering out data whose values are seriously missing.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be equivalently replaced, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (6)

1. A medical big data mining method, characterized in that it includes the following steps:
Collecting the medical data of patients, wherein the medical data of a patient includes behavioral data, clinical data, cost data and insurance data;
Converting the medical data of each patient into structured data;
Establishing a patient-centered relational database;
Cleaning the database by filling in or filtering out missing values;
Computing on the cleaned data with several different interestingness measures to obtain different interesting rules;
Clustering the different interestingness measures with the fuzzy C-means clustering algorithm to obtain the optimized membership degree of each interestingness measure.
2. The medical big data mining method according to claim 1, characterized in that converting the medical data of each patient into structured data specifically includes:
Dividing the medical data of each patient into structured data and unstructured data;
Converting the unstructured data into structured data.
3. The medical big data mining method according to claim 1, characterized in that cleaning the database by filling in or filtering out missing values specifically includes:
Filling missing values using a linear interpolation algorithm or, according to the distribution characteristics of the data, with one of the mode, median, mean, maximum or minimum;
Directly filtering out data whose values are seriously missing.
4. A medical big data mining system, characterized in that it includes:
An acquisition module (10) for collecting the medical data of patients, wherein the medical data of a patient includes behavioral data, clinical data, cost data and insurance data;
A data conversion module (20) for converting the medical data of each patient into structured data;
An establishing module (30) for establishing a patient-centered relational database;
A cleaning module (40) for cleaning the database by filling in or filtering out missing values;
An extraction module (50) for computing on the cleaned data with several different interestingness measures to obtain different interesting rules;
An integration module (60) for clustering the different interestingness measures with the fuzzy C-means clustering algorithm to obtain the optimized membership degree of each interestingness measure.
5. The medical big data mining system according to claim 4, characterized in that the data conversion module (20) is configured to:
Divide the medical data of each patient into structured data and unstructured data;
Convert the unstructured data into structured data.
6. The medical big data mining system according to claim 4, characterized in that the cleaning module (40) specifically includes:
A filling module (401) for filling missing values using a linear interpolation algorithm or, according to the distribution characteristics of the data, with one of the mode, median, mean, maximum or minimum;
A filtering module (402) for directly filtering out data whose values are seriously missing.
CN201810684758.XA 2018-06-28 2018-06-28 Medical big data mining method and system Pending CN109087711A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810684758.XA CN109087711A (en) 2018-06-28 2018-06-28 Medical big data mining method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810684758.XA CN109087711A (en) 2018-06-28 2018-06-28 Medical big data mining method and system

Publications (1)

Publication Number Publication Date
CN109087711A (en) 2018-12-25

Family

ID=64840000

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810684758.XA Pending CN109087711A (en) 2018-06-28 2018-06-28 Medical big data mining method and system

Country Status (1)

Country Link
CN (1) CN109087711A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113436027A (en) * 2021-06-30 2021-09-24 山大地纬软件股份有限公司 Medical insurance reimbursement abnormal data detection method and system

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101520878A (en) * 2009-04-03 2009-09-02 华为技术有限公司 Method, device and system for pushing advertisements to users
CN103678672A (en) * 2013-12-25 2014-03-26 北京中兴通软件科技股份有限公司 Method for recommending information
CN104268290A (en) * 2014-10-22 2015-01-07 武汉科技大学 Recommendation method based on user cluster
CN106055908A (en) * 2016-06-13 2016-10-26 武汉理工大学 Personal medical information recommending method and system based on cloud computation
CN106202891A (en) * 2016-06-30 2016-12-07 电子科技大学 A kind of big data digging method towards Evaluation of Medical Quality

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
余辉: "医学知识获取与发现的研究", 《中国博士学位论文全文数据库》 *

Similar Documents

Publication Publication Date Title
Egger et al. Medical deep learning—A systematic meta-review
Tsumoto et al. Similarity-based behavior and process mining of medical practices
CN107247881A (en) A kind of multi-modal intelligent analysis method and system
Saeed et al. An application of neutrosophic hypersoft mapping to diagnose brain tumor and propose appropriate treatment
Chen et al. Metafed: Federated learning among federations with cyclic knowledge distillation for personalized healthcare
CN110085325B (en) Method and device for constructing knowledge graph about traditional Chinese medicine experience data
Zhang et al. Medical diagnosis data mining based on improved Apriori algorithm
CN116759041B (en) Medical time sequence data generation method and device considering diagnosis and treatment event relationship
Gómez-Pulido et al. Predicting the appearance of hypotension during hemodialysis sessions using machine learning classifiers
Chou et al. Extracting drug utilization knowledge using self-organizing map and rough set theory
Yan et al. Left temporal pole contributes to creative thinking via an individual semantic network
CN106295092A (en) The multi-dimensional data of clinical treatment analyzes method and system
CN109087711A (en) Medical big data method for digging and system
Siu et al. An Intelligent clinical decision support system for assessing the needs of a long-term care plan
de Vries et al. Wearable-measured sleep and resting heart rate variability as an outcome of and predictor for subjective stress measures: A multiple N-of-1 observational study
Ma et al. Retrieval-based gradient boosting decision trees for disease risk assessment
Strobel et al. Healthcare in the Era of Digital twins: towards a Domain-Specific Taxonomy.
US20170039295A1 (en) Tribal abstraction network
Mitra et al. Diagnosing Alzheimer’s Disease Using Deep Learning Techniques
Tsumoto et al. Analytics for hospital management
Gulhane et al. Machine Learning Approach for Early Disease Prediction and Risk Analysis
CN117059231B (en) Method for machine learning of traditional Chinese medicine cases and intelligent diagnosis and treatment system
Wang et al. Enhancing Quality of Patients Care and Improving Patient Experience in China with Assistance of Artificial Intelligence
Mazid THE INTEGRATION OF BUSINESS INTELLIGENCE IN COMPANIES
Li et al. Comparative Study of Domestic and Foreign Health Care Platform

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20181225)