CN108573758B - Intelligent medical big data service system and application method - Google Patents

Publication number
CN108573758B
Authority
CN
China
Prior art keywords
user
hospital
cloud platform
data
health data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810386146.2A
Other languages
Chinese (zh)
Other versions
CN108573758A (en)
Inventor
刘宇红
周进凡
蒋明怀
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guizhou University
Original Assignee
Guizhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guizhou University
Priority to CN201810386146.2A
Publication of CN108573758A
Application granted
Publication of CN108573758B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H: HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H80/00: ICT specially adapted for facilitating communication between medical practitioners or patients, e.g. for collaborative diagnosis, therapy or health monitoring
    • G16H40/00: ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices
    • G16H40/20: ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices for the management or administration of healthcare resources or facilities, e.g. managing hospital staff or surgery rooms
    • G16H50/00: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • G16H50/70: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Public Health (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Epidemiology (AREA)
  • Primary Health Care (AREA)
  • Pathology (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Business, Economics & Management (AREA)
  • General Business, Economics & Management (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The invention discloses an intelligent medical big data service system. The system comprises a user client, a hospital client and a server, wherein the user client is connected with the hospital client; the user client comprises a private cloud health data center for storing the disease information, diagnosis and treatment records, health data and a disease model of a user, and the private cloud health data center is connected with the data acquisition and monitoring unit; the hospital client comprises a hospital cloud platform, and the hospital cloud platform is connected with more than one doctor sub-platform; the hospital cloud platform is connected with a private cloud health data center of the user. The invention can realize medical resource sharing, reduce medical service expenditure, relieve hospital pressure, improve doctor diagnosis efficiency and reduce psychological burden of patients.

Description

Intelligent medical big data service system and application method
Technical Field
The invention relates to the field of medical information, in particular to an intelligent medical big data service system and an application method.
Background
After a decade of rapid development in medical and health information systems and biotechnology, the medical field has entered the "big data era", and data volumes have grown explosively. A digital research report published by International Data Corporation (IDC) shows that the amount of data created and copied worldwide exceeded 1.8 ZB in 2011 alone; if data growth follows the new Moore's law, i.e., global data doubles every two years, the data volume will reach 30 ZB in 2020. Taking McKinsey's forecast report as an example, the total data volume of medical service providers in 2012 was about 5000 PB, and by 2014 the storage of files such as archived medical images, electronic medical records, medical research information and hospitalization records had reached nearly 10000 PB. However, less than 20% of this huge volume of data is used, and much of it sits in a "data museum" as a "cultural relic", which not only burdens hospital data storage but also wastes the data, so that it cannot fulfil its purpose.
Medical resources in today's society are unevenly distributed: high-quality medical service resources are concentrated in first-tier cities, while rural areas are seriously short of medical resources. The shortage of medical staff means that one doctor sees hundreds of patients every day, and the gap between the enormous demand for diagnosis and treatment and the relative shortage of medical resources has become the main cause of doctor-patient conflict. According to statistics, from 2002 to 2012 violent incidents against medical staff increased by 23% per year on average, and each hospital saw an average of 27 attacks on medical staff per year. Even in the United States, whose medical sector is well developed, nearly half of the population receives inadequate treatment each year; over 200 million people suffer hospital infections; and over 100 million people suffer disabling complications during surgery, half of which are avoidable.
Population aging presents new challenges to the medical field. It is estimated that in 2020 and 2050 the number of people over 60 in China will reach 234 million and 437 million respectively, exceeding 16% and 30% of the population respectively. Because of chronic diseases associated with an aging population and the shortage of medical staff, medical expenses account for a large share of total social expenditure and are rising year by year; medical expenses are expected to grow by about 5.2% in real terms over 2010-2030. According to the American College of Medicine, one third of current healthcare spending is wasted rather than used to improve medical care. This waste includes unnecessary services, administrative waste, excessive medical prices, medical fraud and missed opportunities for prevention.
Disclosure of Invention
The invention aims to provide an intelligent medical big data service system. The invention can realize medical resource sharing, reduce medical service expenditure, relieve hospital pressure, improve doctor diagnosis efficiency and reduce psychological burden of patients.
The technical scheme of the invention is as follows: an intelligent medical big data service system comprises a user client, wherein the user client is connected with a hospital client; the user client comprises a private cloud health data center for storing the disease information, diagnosis and treatment records, health data and a disease model of a user, and the private cloud health data center is connected with the data acquisition and monitoring unit; the hospital client comprises a hospital cloud platform, and the hospital cloud platform is connected with more than one doctor sub-platform; the hospital cloud platform is connected with a private cloud health data center of the user.
The intelligent medical big data service system further comprises a medicine enterprise cloud platform, and the medicine enterprise cloud platform is connected with the private cloud health data center of the user and the hospital cloud platform respectively.
In the application method of the intelligent medical big data service system, the hospital cloud platform establishes a pathological model, and the pathological model is stored in the private cloud health data center of the user client; the user's disease information, diagnosis and treatment records, and the health data acquired in real time by the data acquisition and monitoring unit are stored in the user's private cloud health data center, which matches the acquired health data against the corresponding pathological model on the cloud platform to analyze the user's real-time health condition; when the analysis result is healthy, health information is fed back to the user periodically.
In the application method of the intelligent medical big data service system, when the analysis result is sick or possibly sick, the private cloud health data center makes a diagnosis and provides a treatment opinion; if the treatment opinion is effective, the treatment result is fed back to the user's private cloud health data center, and the user's health then continues to be monitored in real time through the data acquisition and monitoring unit.
In the application method of the intelligent medical big data service system, when the treatment opinion is ineffective, the private cloud health data center transmits the pathological information obtained through diagnosis to the hospital cloud platform of the hospital client; the hospital cloud platform classifies the user's disease according to the pathological information and assigns it to the doctor sub-platform of the corresponding department; the doctor uses the hospital cloud platform to analyze past cases or the treatment plans of patients who have had the disease, which assists the diagnosis and yields the doctor's treatment opinion; the hospital cloud platform then transmits the doctor's treatment opinion to the user's private cloud health data center so that the user can be treated.
In the application method of the intelligent medical big data service system, establishing the pathological model specifically comprises: collecting medical data and transmitting it to the hospital cloud platform with the Flume tool, or storing medical data by importing data from a database into the hospital cloud platform; processing the medical data on MapReduce, a cluster-based high-performance parallel computing platform; analyzing the processed medical data with an association rule algorithm to find associations among diseases; and building the corresponding medical pathological model with a decision tree.
In the application method of the intelligent medical big data service system, analyzing the processed medical data with the association rule algorithm to find associations among diseases is implemented as follows: the frequent item sets in the processed medical data are mined with the FP-growth algorithm, and once the frequent item sets are obtained, association rules are derived according to the minimum support and the minimum confidence.
In the application method of the intelligent medical big data service system, the medicine enterprise cloud platform acquires part of the data in the user's private cloud health data center and medical data from the hospital cloud platform; after analyzing the data, the cloud platform sends medicine information to the user's private cloud health data center and to the hospital cloud platform.
Advantageous effects
Compared with the prior art, the invention monitors and collects the user's health data in real time and transmits it to the private cloud health data center, which analyzes the data, thereby achieving real-time monitoring of the user's health. When the analysis result indicates illness, the private cloud health data center makes a diagnosis and provides a treatment opinion, and if the treatment opinion is effective, the treatment result is fed back to the user's private cloud health data center. In this way, diseases that can be cured by following the treatment opinion of the private cloud health data center need not be treated in a hospital, which effectively relieves hospital pressure, reduces medical service expenditure and makes treatment more convenient.
According to the invention, when the disease cannot be cured by the treatment opinion given by the private cloud health data center, the private cloud health data center transmits the pathological information obtained through diagnosis to the hospital cloud platform of the hospital client; the hospital cloud platform classifies the user's disease according to the pathological information and assigns it to the doctor sub-platform of the corresponding department; the doctor uses the hospital cloud platform to analyze past cases or the treatment plans of patients who have had the disease, which assists the diagnosis and yields the doctor's treatment opinion; the hospital cloud platform then transmits the doctor's treatment opinion to the user's private cloud health data center so that the user can be treated. In this way, on the one hand, the doctor knows the patient's condition before the consultation and can draw on past cases or the treatment plans of earlier patients to assist the diagnosis, which effectively improves diagnostic efficiency and reduces the misdiagnosis rate; on the other hand, the patient can choose to upload pathological information to the hospital best suited to treating the disease, so as to obtain the best treatment plan.
In conclusion, the invention can realize medical resource sharing, reduce medical service expenditure, relieve hospital pressure, improve doctors' diagnostic efficiency, reduce the psychological burden on patients (for many people, especially chronic disease patients, going to the hospital for treatment can be oppressive and cause psychological stress), help users avoid preventable diseases as far as possible, and analyze disease data in real time to provide diagnosis and treatment suggestions in real time.
Drawings
FIG. 1 is a service flow diagram of the present invention;
FIG. 2 is a logic block diagram of the present invention;
FIG. 3 shows the Flume transfer format;
FIG. 4 is an architectural diagram of MapReduce;
FIG. 5 is an FP-tree holding compressed frequent pattern information;
FIG. 6 is a graph of strong associations between some diseases;
FIG. 7 is a schematic structural diagram of the present invention.
Detailed Description
The invention is further illustrated by the following figures and examples, which are not to be construed as limiting the invention.
Example 1. An intelligent medical big data service system is structurally shown in fig. 7 and comprises a user client, a hospital client and a server, wherein the user client is connected with the hospital client; the user client comprises a private cloud health data center for storing the disease information, diagnosis and treatment records, health data and a disease model of a user, and the private cloud health data center is connected with the data acquisition and monitoring unit; the hospital client comprises a hospital cloud platform, and the hospital cloud platform is connected with more than one doctor sub-platform; the hospital cloud platform is connected with a private cloud health data center of the user. The data acquisition monitoring unit can be a wearable intelligent device.
The intelligent medical big data service system further comprises a medicine enterprise cloud platform, and the medicine enterprise cloud platform is connected with the private cloud health data center of the user and the hospital cloud platform respectively.
The application method of the intelligent medical big data service system comprises the following steps: the hospital cloud platform analyzes the associations among diseases with an association rule algorithm, establishes a pathological model with a decision tree, and stores the pathological model in the private cloud health data center of the user client; the user's disease information, diagnosis and treatment records, and the health data acquired in real time by the data acquisition and monitoring unit are stored in the user's private cloud health data center, which matches the acquired health data against the corresponding pathological model on the cloud platform to analyze the user's real-time health condition; when the analysis result is healthy, health information is fed back to the user periodically.
When the analysis result is sick or possibly sick, the private cloud health data center makes a diagnosis and provides a treatment opinion; if the treatment opinion is effective, the treatment result is fed back to the user's private cloud health data center, and the user's health then continues to be monitored in real time through the data acquisition and monitoring unit.
When the treatment opinion is ineffective, the private cloud health data center transmits the pathological information obtained through diagnosis to the hospital cloud platform of the hospital client; the hospital cloud platform classifies the user's disease according to the pathological information and assigns it to the doctor sub-platform of the corresponding department; the doctor uses the hospital cloud platform to analyze past cases or the treatment plans of patients who have had the disease, which assists the diagnosis and yields the doctor's treatment opinion; the hospital cloud platform then transmits the doctor's treatment opinion to the user's private cloud health data center so that the user can be treated.
The user can also perform a self-check for disease in the private cloud health data center, using the associations between the user's own past cases and diseases and between the corresponding symptoms and diseases.
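The three branches of the application method described above (healthy, sick with an effective local treatment opinion, sick with an ineffective one) can be sketched as a small routing function. This is only an illustrative sketch: the names `Status`, `Outcome` and `triage`, and the message strings, are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    HEALTHY = "healthy"
    SICK = "sick"  # covers "sick or possibly sick"


@dataclass
class Outcome:
    handled_by: str  # "private_cloud" or "hospital_cloud" (hypothetical labels)
    action: str


def triage(status: Status, local_advice_effective: bool) -> Outcome:
    """Route one analysis result through the three branches in the text."""
    if status is Status.HEALTHY:
        # Healthy: the private cloud periodically feeds back health information.
        return Outcome("private_cloud", "send periodic health feedback")
    if local_advice_effective:
        # Sick, and the private cloud's treatment opinion worked.
        return Outcome("private_cloud", "record treatment result and keep monitoring")
    # Sick, and the local opinion failed: escalate to the hospital cloud platform.
    return Outcome("hospital_cloud", "forward pathological information to a doctor sub-platform")


print(triage(Status.SICK, local_advice_effective=False).handled_by)
```

How effectiveness of a treatment opinion is judged is not specified in the text, so it is passed in here as a plain boolean.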
Establishing the pathological model specifically comprises: collecting medical data and transmitting it to the hospital cloud platform with the Flume tool, or storing medical data by importing data from a database into the hospital cloud platform; processing the medical data on MapReduce, a cluster-based high-performance parallel computing platform (that is, a distributed data processing method based on parallel computing); analyzing the processed medical data with an association rule algorithm to find associations among diseases; and building the corresponding medical pathological model with a decision tree. The medical data include data from wearable intelligent devices, electronic medical records, medical images, clinical examinations, medical literature, doctor and patient behavior, the medical insurance industry, the pharmaceutical industry, medical marketing enterprises, and the like.
To collect medical data and transmit it to the hospital cloud platform for storage, this embodiment uses Flume to collect data from the data sources in FIG. 2 and send the collected data to the designated Sink (destination).
Flume is a highly available, reliable, distributed system for collecting, aggregating and transferring massive volumes of log data; its design is based on collecting data streams from many web servers and storing them in centralized stores such as HDFS and HBase.
The Event is the basic unit of data transfer inside Flume; events are what flow through the entire transfer process. An event consists of a byte array carrying the data and an optional header.
The smallest independently running unit of Flume is the Agent. An Agent is an independent daemon process (a JVM) that receives data from a client or from other Agents and then passes the data on to the next destination node, a Sink or another Agent. An Agent mainly consists of three components, Source, Channel and Sink, through which it moves events from an external source to a destination. The Source receives data from the data generator and passes it, in Flume's event format, to one or more Channels. The Channel is a temporary storage container: it caches the event-format data received from the Source until the data is sent to the Sink, and Flume deletes the cached data only after the data has actually reached the Sink and been consumed. A Channel can be linked to any number of Sources and Sinks. The Sink stores data in centralized storage (e.g., HBase and HDFS): it consumes events from the Channel and passes them on to the next Agent or to HDFS or HBase. The Flume transfer format is shown in FIG. 3(a) and 3(b).
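The Source, Channel and Sink roles described above can be illustrated with a toy in-memory model. This is a conceptual sketch in Python, not Flume's actual Java API; the class and function names, the `origin` header and the sample heart-rate records are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Dict, List


@dataclass
class Event:
    body: bytes  # byte array carrying the data
    headers: Dict[str, str] = field(default_factory=dict)  # optional header


class Channel:
    """Temporary buffer that holds events between a source and a sink."""

    def __init__(self) -> None:
        self._queue: Deque[Event] = deque()

    def put(self, event: Event) -> None:
        self._queue.append(event)

    def take(self) -> Event:
        # The event leaves the buffer once the sink has consumed it.
        return self._queue.popleft()

    def __len__(self) -> int:
        return len(self._queue)


def source(records: List[str], channel: Channel) -> None:
    """Wrap raw records as events and push them into the channel."""
    for record in records:
        channel.put(Event(record.encode("utf-8"), {"origin": "wearable"}))


def sink(channel: Channel, store: List[bytes]) -> None:
    """Drain the channel into centralized storage (a list stands in for HDFS/HBase)."""
    while len(channel):
        store.append(channel.take().body)


channel = Channel()
storage: List[bytes] = []
source(["hr=72", "hr=75"], channel)
sink(channel, storage)
print(storage)
```

A real Flume Channel is transactional, so an event is removed only after the Sink commits; the sketch collapses that into a single `take` call.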
The data collected through the path shown in FIG. 3 is stored and then processed with MapReduce, a distributed data processing method based on parallel computing. MapReduce is a cluster-based high-performance parallel computing platform that allows commodity servers to form a distributed parallel computing cluster of tens, hundreds or thousands of nodes. A MapReduce cluster has a master-slave structure, with one control node and a number of working nodes. While the cluster is running, every working node periodically sends heartbeat information to the control node to report its current state. On receiving a heartbeat, the control node sends instructions to the working node according to the current workload and the state of the working node, and the working node carries out the corresponding actions. In the MapReduce framework, the basic unit of data processing work submitted by a user is a "job". In a MapReduce cluster, a job is executed in two phases, Map and Reduce. In each phase, multiple tasks are executed in parallel; these tasks are distributed across working nodes, where the actual data processing is done. MapReduce adopts a Master/Slave (M/S) architecture consisting mainly of the following components: Client, JobTracker, TaskTracker and Task. The architecture of MapReduce is shown in FIG. 4. A MapReduce program written by the user is submitted to the JobTracker through the Client; the user can also check the running state of a job through interfaces provided by the Client. A MapReduce program may correspond to several jobs, and each job is decomposed into several Map/Reduce tasks. In FIG. 4, the JobTracker is mainly responsible for resource monitoring and job scheduling.
The JobTracker monitors the health of all TaskTrackers and jobs, and once a failure is detected it transfers the affected tasks to other nodes; the JobTracker also tracks information such as task execution progress and resource usage and passes it to the task scheduler, which selects suitable tasks to use resources as they become idle. In Hadoop, the task scheduler is a pluggable module, and users can design their own scheduler to suit their needs. The TaskTracker periodically reports the resource usage and task progress on its node to the JobTracker via heartbeat, and receives commands from the JobTracker to perform the corresponding operations (e.g. start a new task, kill a task).
The TaskTracker divides the resources on its node into equal "slots". A slot represents a unit of computing resources (CPU, memory, etc.). A Task can run only after it acquires a slot, and the Hadoop scheduler allocates the free slots on each TaskTracker to Tasks. Slots are divided into Map slots and Reduce slots, used by MapTasks and ReduceTasks respectively. The TaskTracker limits Task concurrency through the number of slots (a configurable parameter). Tasks are divided into MapTasks and ReduceTasks, both launched by the TaskTracker. The unit of processing in MapReduce is the split. A split is a logical concept that contains only metadata, such as the start position of the data, the data length and the nodes where the data resides; how splits are divided is defined by the user. Each split is processed by one MapTask: the MapTask parses the split into key/value pairs iteratively, calls the user-defined map() function on each pair in turn, and finally stores the result on local disk, where the intermediate data is divided into several partitions, each of which is processed by one ReduceTask.
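The Map and Reduce phases described above can be imitated in a single process to show the data flow: each split is mapped to key/value pairs, the pairs are grouped by key, and each group is reduced. The sketch below is not Hadoop code; it counts how often two diagnoses occur in the same record, and the disease names and function names are hypothetical sample data chosen for illustration.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, Iterable, Iterator, List, Tuple

Pair = Tuple[str, str]


def map_record(diagnoses: List[str]) -> Iterator[Tuple[Pair, int]]:
    """Map step: emit ((disease_a, disease_b), 1) for each co-occurring pair."""
    for pair in combinations(sorted(set(diagnoses)), 2):
        yield pair, 1


def shuffle(mapped: Iterable[Tuple[Pair, int]]) -> Dict[Pair, List[int]]:
    """Group intermediate values by key, as happens between Map and Reduce."""
    groups: Dict[Pair, List[int]] = defaultdict(list)
    for key, value in mapped:
        groups[key].append(value)
    return groups


def reduce_counts(key: Pair, values: List[int]) -> Tuple[Pair, int]:
    """Reduce step: sum the counts collected for one key."""
    return key, sum(values)


def run_job(splits: List[List[List[str]]]) -> Dict[Pair, int]:
    """Run map over every record of every split, then shuffle and reduce."""
    mapped = (kv for split in splits for record in split for kv in map_record(record))
    return dict(reduce_counts(k, v) for k, v in shuffle(mapped).items())


# Two hypothetical splits, each a list of patient records.
splits = [
    [["hypertension", "diabetes"], ["diabetes", "obesity"]],
    [["hypertension", "diabetes", "obesity"]],
]
counts = run_job(splits)
print(counts)
```

In a real cluster each split would be handled by a separate MapTask and each key partition by a separate ReduceTask; here everything runs sequentially.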
As described above, the association rule algorithm analyzes the processed medical data to find associations among diseases. It is implemented as follows: the frequent item sets in the processed medical data are mined with the FP-growth algorithm, and the association rules are then derived according to the minimum support and the minimum confidence.
To obtain the association rule, a frequent item set needs to be found first. The specific implementation process of mining the frequent item set by using the FP-growth algorithm is as follows:
Frequent item set mining finds all frequent item sets in a given data set by comparing the support of each item set with a threshold. Let $I = \{I_1, I_2, \ldots, I_m\}$ be a set of items. Given a transaction database D, each transaction T is a non-empty subset of I and has a unique identifier TID (transaction ID). Let A be a set of items; transaction T contains A if and only if
$$A \subseteq T$$
An association rule has the form
$$A \Rightarrow B$$
where $A \subset I$, $B \subset I$, $A \neq \varnothing$, $B \neq \varnothing$, and $A \cap B = \varnothing$. The rule $A \Rightarrow B$ holds in the transaction set D with support s, where s is the percentage of transactions in D that contain $A \cup B$ (i.e., the union of sets A and B), i.e., $P(A \cup B)$. The rule $A \Rightarrow B$ has confidence c in the transaction set D, where c is the percentage of transactions in D containing A that also contain B, i.e. the conditional probability $P(B \mid A)$. Namely:
$$\mathrm{support}(A \Rightarrow B) = P(A \cup B)$$
$$\mathrm{confidence}(A \Rightarrow B) = P(B \mid A)$$
Once the frequent item sets have been found from the transactions in database D, strong association rules (rules that satisfy both the minimum support and the minimum confidence) can be generated from them directly. The confidence is calculated by the following formula:
$$\mathrm{confidence}(A \Rightarrow B) = P(B \mid A) = \frac{\mathrm{support\_count}(A \cup B)}{\mathrm{support\_count}(A)}$$
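The support and confidence definitions above translate directly into counting over transactions. The following minimal sketch assumes a tiny hypothetical transaction database of symptom sets; the item names and function names are illustrative only.

```python
from typing import FrozenSet, List

Transaction = FrozenSet[str]


def support_count(itemset: FrozenSet[str], transactions: List[Transaction]) -> int:
    """Number of transactions T with itemset ⊆ T."""
    return sum(1 for t in transactions if itemset <= t)


def support(a: FrozenSet[str], b: FrozenSet[str], transactions: List[Transaction]) -> float:
    """support(A => B) = P(A ∪ B): fraction of transactions containing both A and B."""
    return support_count(a | b, transactions) / len(transactions)


def confidence(a: FrozenSet[str], b: FrozenSet[str], transactions: List[Transaction]) -> float:
    """confidence(A => B) = support_count(A ∪ B) / support_count(A)."""
    return support_count(a | b, transactions) / support_count(a, transactions)


# Hypothetical transaction database D (not taken from the patent).
D = [frozenset(t) for t in (
    {"fever", "cough"},
    {"fever", "cough", "fatigue"},
    {"fever"},
    {"cough"},
)]
A, B = frozenset({"fever"}), frozenset({"cough"})
print(support(A, B, D), confidence(A, B, D))
```

Here two of the four transactions contain both items, and two of the three transactions containing "fever" also contain "cough".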
Frequent item sets are mined with the FP-growth algorithm. Suppose the transaction data for certain diseases at a hospital are as shown in Table 1. The database is first scanned once to derive the set of frequent 1-item sets and their support counts, with the minimum support count set to 2. The resulting set, sorted in descending order of support count, is L = {{I2: 7}, {I1: 6}, {I3: 6}, {I4: 2}, {I5: 2}}. The FP-tree is then constructed: first the root node of the tree is created and labeled null, and then database D is scanned a second time. The items in each transaction are processed in L order (i.e., sorted by decreasing support count) and a branch is created for each transaction. In general, when a branch is added for a transaction, the count of each node along the common prefix is increased by 1, and nodes and links are created for the items that follow the prefix. To facilitate traversal of the tree, an item header table is created in which each item points to its occurrences in the tree through a chain of node links. The FP-tree obtained after all transactions have been scanned is shown in FIG. 5, and the resulting frequent patterns are shown in Table 2. Assuming a minimum confidence threshold of 95%, only the rules involving {I2, I5} (confidence 100%), {I1, I5} and {I1, I2} (confidence 100%) are strong. In this way, diseases can be predicted from symptoms, so that treatment can begin before the disease manifests. FIG. 6 is a graph of the relationships between some diseases mined by the FP-growth algorithm, in which an edge between two nodes indicates a strong association between the two corresponding diseases.
TABLE 1 (transaction data for certain diseases at a hospital; table image not reproduced)
TABLE 2 (frequent patterns mined from the FP-tree; table image not reproduced)
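The FP-tree construction and mining procedure described above can be sketched compactly. Because Table 1 is an image that is not reproduced in this text, the nine transactions below are an assumed stand-in, chosen only so that their support counts match those stated in the description (I2: 7, I1: 6, I3: 6, I4: 2, I5: 2); they should not be read as the patent's actual data. The names `Node`, `build_tree` and `fp_growth` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, List, Optional, Tuple

Weighted = List[Tuple[List[str], int]]  # (items, count) pairs


class Node:
    def __init__(self, item: Optional[str], parent: Optional["Node"]) -> None:
        self.item, self.parent, self.count = item, parent, 0
        self.children: Dict[str, "Node"] = {}


def build_tree(weighted: Weighted, min_count: int):
    """Two passes: count items, then insert each transaction in L order."""
    counts: Dict[str, int] = defaultdict(int)
    for items, weight in weighted:
        for item in set(items):
            counts[item] += weight
    frequent = {i: c for i, c in counts.items() if c >= min_count}
    root = Node(None, None)  # root is labeled null
    header: Dict[str, List[Node]] = defaultdict(list)  # item -> its tree nodes
    for items, weight in weighted:
        ordered = sorted((i for i in set(items) if i in frequent),
                         key=lambda i: (-frequent[i], i))
        node = root
        for item in ordered:
            if item not in node.children:
                node.children[item] = Node(item, node)
                header[item].append(node.children[item])
            node = node.children[item]
            node.count += weight  # shared prefixes accumulate counts
    return header, frequent


def fp_growth(weighted: Weighted, min_count: int,
              suffix: FrozenSet[str] = frozenset()) -> Dict[FrozenSet[str], int]:
    """Return {frequent item set: support count} by recursive pattern growth."""
    header, frequent = build_tree(weighted, min_count)
    patterns: Dict[FrozenSet[str], int] = {}
    for item, count in frequent.items():
        itemset = suffix | {item}
        patterns[itemset] = count
        # Conditional pattern base: the prefix path of every node holding `item`.
        base: Weighted = []
        for node in header[item]:
            path, parent = [], node.parent
            while parent is not None and parent.item is not None:
                path.append(parent.item)
                parent = parent.parent
            if path:
                base.append((path, node.count))
        patterns.update(fp_growth(base, min_count, itemset))
    return patterns


# Assumed stand-in for Table 1 (see the note above).
transactions = [
    ["I1", "I2", "I5"], ["I2", "I4"], ["I2", "I3"],
    ["I1", "I2", "I4"], ["I1", "I3"], ["I2", "I3"],
    ["I1", "I3"], ["I1", "I2", "I3", "I5"], ["I1", "I2", "I3"],
]
patterns = fp_growth([(t, 1) for t in transactions], min_count=2)
for itemset, count in sorted(patterns.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

The header table here is a plain list of nodes per item rather than a linked chain, which serves the same traversal purpose in a short sketch.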
In this embodiment, a decision tree, one of the classification algorithms, is used for auxiliary diagnosis of clinical diseases: diagnostic rules are extracted from the clinical database in order to improve diagnostic accuracy. Decision tree learning is supervised machine learning, that is, the classifier is learned under "supervision" in the sense that the class of each training tuple is known. The specific implementation is as follows:
The decision tree is constructed with the C4.5 algorithm. Let node N represent, or hold, the tuples of partition D. The split point is chosen according to the information gain ratio.
The expected information needed to classify a tuple in D is given by:
$$\mathrm{Info}(D) = -\sum_{i=1}^{m} p_i \log_2(p_i)$$
where m is the number of classes and $p_i$ is the non-zero probability that an arbitrary tuple in D belongs to class $C_i$, estimated by $|C_{i,D}|/|D|$. The base-2 logarithm is used because the information is encoded in bits. Info(D) is the average amount of information needed to identify the class label of a tuple in D, also called the entropy of D. Suppose the tuples in D are partitioned on some attribute A that, according to the training data, takes v distinct values $\{a_1, a_2, \dots, a_v\}$. If A is discrete-valued, these values correspond to the v outcomes of a test on A. Attribute A can then be used to split D into v partitions or subsets $\{D_1, D_2, \dots, D_v\}$, where $D_j$ contains the tuples in D whose value of A is $a_j$. These partitions correspond to the branches grown from node N. The expected information still needed to classify a tuple of D after partitioning on A is calculated by:
$$\mathrm{Info}_A(D) = \sum_{j=1}^{v} \frac{|D_j|}{|D|} \times \mathrm{Info}(D_j)$$
where the term
$$\frac{|D_j|}{|D|}$$
serves as the weight of the j-th partition. The smaller the expected information still required, the purer the partitions.
The information gain is defined as the difference between the original information requirement and the new requirement after partitioning on A, namely:
$$\mathrm{Gain}(A) = \mathrm{Info}(D) - \mathrm{Info}_A(D)$$
Gain(A) tells us how much information is gained by partitioning on A.
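A small numeric sketch may help make the entropy and information gain definitions concrete. It assumes the standard forms Info(D) = -Σ p_i log2(p_i) and Info_A(D) = Σ |D_j|/|D| × Info(D_j); the fourteen training tuples, the attribute values a1 to a3 and the classes C1 and C2 are invented for illustration, and the function names are hypothetical.

```python
from collections import Counter
from math import log2


def info(labels):
    """Entropy Info(D) of a list of class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def info_after_split(rows):
    """Info_A(D) for rows given as (attribute_value, class_label) pairs."""
    partitions = {}
    for value, label in rows:
        partitions.setdefault(value, []).append(label)
    n = len(rows)
    # Each partition D_j is weighted by |D_j| / |D|.
    return sum(len(p) / n * info(p) for p in partitions.values())


def gain(rows):
    """Gain(A) = Info(D) - Info_A(D)."""
    return info([label for _, label in rows]) - info_after_split(rows)


# Invented data: 9 tuples of class C1 and 5 of class C2.
ROWS = (
    [("a1", "C1")] * 2 + [("a1", "C2")] * 3
    + [("a2", "C1")] * 4
    + [("a3", "C1")] * 3 + [("a3", "C2")] * 2
)

labels = [label for _, label in ROWS]
print(f"Info(D)   = {info(labels):.3f} bits")
print(f"Info_A(D) = {info_after_split(ROWS):.3f} bits")
print(f"Gain(A)   = {gain(ROWS):.3f} bits")
```

For this invented data the entropy is about 0.940 bits, and partitioning on A removes about 0.247 bits of it, mostly because the a2 partition is pure.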
The split information is defined as follows:
$$\mathrm{SplitInfo}_A(D) = -\sum_{j=1}^{v} \frac{|D_j|}{|D|} \times \log_2\left(\frac{|D_j|}{|D|}\right)$$
$\mathrm{SplitInfo}_A(D)$ represents the information generated by splitting the training data set D into v partitions, corresponding to the v outcomes of a test on attribute A. The gain ratio is defined as:
$$\mathrm{GainRatio}(A) = \frac{\mathrm{Gain}(A)}{\mathrm{SplitInfo}_A(D)}$$
The attribute with the maximum gain ratio is selected as the splitting attribute. The process is then repeated in each child node produced by the split until the decision tree is complete.
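The selection of the splitting attribute by gain ratio can be sketched as below, assuming the usual C4.5 definitions SplitInfo_A(D) = -Σ |D_j|/|D| × log2(|D_j|/|D|) and GainRatio(A) = Gain(A) / SplitInfo_A(D). The two attributes A and B, their values and the class labels are invented, and the guard for a zero split information is an added assumption that the text does not discuss.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Entropy in bits of the empirical distribution of values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def gain_ratio(rows, attr, target="label"):
    """GainRatio(attr) = Gain(attr) / SplitInfo_attr(D)."""
    n = len(rows)
    partitions = {}
    for row in rows:
        partitions.setdefault(row[attr], []).append(row[target])
    info_d = entropy([row[target] for row in rows])
    info_a = sum(len(p) / n * entropy(p) for p in partitions.values())
    # SplitInfo is the entropy of the attribute's own value distribution.
    split_info = entropy([row[attr] for row in rows])
    if split_info == 0:     # attribute has a single value: no useful split
        return 0.0
    return (info_d - info_a) / split_info


def best_attribute(rows, attributes):
    """Attribute with the maximum gain ratio."""
    return max(attributes, key=lambda a: gain_ratio(rows, a))


# Invented training tuples: (value of A, value of B, class label).
RAW = [
    ("a1", "b1", "C1"), ("a1", "b2", "C1"), ("a1", "b1", "C2"),
    ("a1", "b2", "C2"), ("a1", "b2", "C2"), ("a2", "b1", "C1"),
    ("a2", "b1", "C1"), ("a2", "b2", "C1"), ("a2", "b2", "C1"),
    ("a3", "b1", "C1"), ("a3", "b1", "C1"), ("a3", "b2", "C1"),
    ("a3", "b1", "C2"), ("a3", "b2", "C2"),
]
ROWS = [{"A": a, "B": b, "label": c} for a, b, c in RAW]

for attr in ("A", "B"):
    print(attr, round(gain_ratio(ROWS, attr), 3))
best = best_attribute(ROWS, ["A", "B"])
print("split on", best)
```

A full tree builder would call `best_attribute` on the root, partition the rows by the chosen attribute, and recurse into each partition until it is pure or no attributes remain.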
When a decision tree is built, many branches reflect anomalies in the training set caused by noise and outliers in the data. Pruning is used to deal with this over-fitting problem. C4.5 uses the pessimistic error pruning method (PEP); the specific algorithm is as follows:
Let e(t) be the error at node t; let i range over the leaves of the subtree $T_t$; let $N_t$ be the number of leaves of the subtree $T_t$; and let n(t) be the number of training instances at node t.
$$e'(t) = e(t) + \frac{1}{2}$$
$$e'(T_t) = \sum_{i} e(i) + \frac{N_t}{2}$$
$$Se(e'(T_t)) = \sqrt{\frac{e'(T_t)\,\bigl(n(t) - e'(T_t)\bigr)}{n(t)}}$$
If $e'(t) \le e'(T_t) + Se(e'(T_t))$ holds, then $T_t$ should be pruned.
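The pruning test can be sketched numerically as follows. The sketch assumes the commonly cited form of pessimistic error pruning: a continuity correction of 1/2 per leaf, so that e'(t) = e(t) + 1/2 and e'(T_t) = Σ e(i) + N_t/2, with standard error sqrt(e'(T_t) × (n(t) − e'(T_t)) / n(t)). The function name and the error counts are invented for illustration.

```python
from math import sqrt


def should_prune(node_errors, leaf_errors, n_instances):
    """Return True if the subtree rooted at node t should become a leaf.

    node_errors : e(t), errors if t were turned into a leaf
    leaf_errors : e(i) for every leaf i of the subtree T_t
    n_instances : n(t), training instances reaching node t
    """
    corrected_node = node_errors + 0.5                            # e'(t)
    corrected_subtree = sum(leaf_errors) + len(leaf_errors) / 2   # e'(T_t)
    std_error = sqrt(
        corrected_subtree * (n_instances - corrected_subtree) / n_instances
    )
    return corrected_node <= corrected_subtree + std_error


# Invented example: 100 instances reach t; its subtree has 4 leaves.
prune_a = should_prune(node_errors=20, leaf_errors=[5, 4, 3, 3], n_instances=100)
prune_b = should_prune(node_errors=30, leaf_errors=[5, 4, 3, 3], n_instances=100)
print(prune_a, prune_b)
```

In the first call the corrected subtree error is 17 with a standard error of about 3.76, so a leaf with corrected error 20.5 is within one standard error and the subtree is pruned; in the second call the leaf error of 30.5 is too high and the subtree is kept.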
During pruning with the PEP algorithm, each subtree in the tree is visited at most once, and in the worst case the time complexity is only linear in the number of non-leaf nodes of the unpruned tree. The PEP algorithm is regarded as one of the more accurate algorithms among current decision tree post-pruning methods.
The medicine enterprise cloud platform acquires part of the data in the user's private cloud health data center and the medical data of the hospital cloud platform. Acquiring user data requires the user's consent, and the user's personal information must be concealed before the data can be submitted to the medicine enterprise cloud platform. After the medicine enterprise cloud platform has analyzed the data, it sends medicine information to the user's private cloud health data center and to the hospital cloud platform. Through the user's private cloud health data center and the hospital cloud platform, the medicine enterprise obtains information on the curative effect and side effects of its medicines and adjusts them accordingly. By analyzing the hospital cloud platform data it also learns the incidence of particular diseases and where patients with those diseases are concentrated, and can then carry out targeted medicine production and marketing. In addition, the medicine enterprise can provide relevant medicine information to hospitals and users, which promotes its medicines and enables communication between the medicine enterprise, the hospital side and the user side.
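The consent and concealment requirement for data leaving the user's private cloud health data center might look like the following minimal sketch. The record layout, the field names in `PERSONAL_FIELDS` and the function `prepare_for_enterprise` are all hypothetical; the description does not specify which fields count as personal information or how they are concealed, and simply dropping the fields is only one possible approach.

```python
# Hypothetical set of fields treated as personal information.
PERSONAL_FIELDS = {"name", "id_number", "phone", "address"}


def prepare_for_enterprise(record, user_consented):
    """Return a copy of record that may be submitted to the medicine
    enterprise cloud platform, or None if the user has not consented."""
    if not user_consented:
        return None
    # Conceal personal information by leaving those fields out entirely.
    return {k: v for k, v in record.items() if k not in PERSONAL_FIELDS}


record = {
    "name": "example user",
    "id_number": "000000",
    "phone": "000-0000",
    "medicine": "medicine X",
    "curative_effect": "improved",
    "side_effect": "mild nausea",
}

shared = prepare_for_enterprise(record, user_consented=True)
blocked = prepare_for_enterprise(record, user_consented=False)
print(shared)
print(blocked)
```

Removing fields is the simplest form of concealment; a real deployment would also have to consider indirect identifiers such as rare diagnoses combined with location.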

Claims (3)

1. An application method of an intelligent medical big data service system is characterized in that the intelligent medical big data service system comprises a user client, and the user client is connected with a hospital client; the user client comprises a private cloud health data center for storing the disease information, diagnosis and treatment records, health data and a disease model of a user, and the private cloud health data center is connected with the data acquisition and monitoring unit; the hospital client comprises a hospital cloud platform, and the hospital cloud platform is connected with more than one doctor sub-platform; the hospital cloud platform is connected with a private cloud health data center of the user; the system also comprises a medicine enterprise cloud platform which is respectively connected with the private cloud health data center of the user and the hospital cloud platform;
establishing a pathological model by a hospital cloud platform, and storing the pathological model into a private cloud health data center of a user client; storing the ill information, diagnosis and treatment records of the user and the health data of the user acquired by the data acquisition and monitoring unit in real time into a private cloud health data center of the user, and matching the private cloud health data center of the user with a corresponding pathological model on a cloud platform according to the acquired health data to analyze the real-time health condition of the user; when the analysis result is healthy, regularly feeding back health information to the user; when the analysis result is sick or possibly sick, the private cloud health data center diagnoses and provides treatment opinions, if the treatment opinions are effective, the treatment result is fed back to the private cloud health data center of the user, and then the health of the user is continuously monitored in real time through the data acquisition monitoring unit; when the treatment opinions are invalid, the private cloud health data center transmits the pathological information obtained through diagnosis to a hospital cloud platform of a hospital client, the hospital cloud platform classifies diseases suffered by the user according to the pathological information and distributes the diseases to doctor sub-platforms of corresponding departments, a doctor analyzes the treatment scheme of the past case or the patient who suffers from the diseases through the hospital cloud platform to assist the doctor in diagnosing and obtaining the treatment opinions of the doctor, and then the hospital cloud platform transmits the treatment opinions of the doctor to the private cloud health data center of the user for the user to treat;
the pathological model is established as follows: the acquired medical data are transmitted to the hospital cloud platform through the Flume tool, or are stored by importing data from a database into the hospital cloud platform; the medical data are processed on a high-performance parallel computing platform based on a MapReduce cluster; the processed medical data are then analyzed with an association rule algorithm to find the associations among diseases, and the corresponding medical pathological model is established with a decision tree.
2. The application method of the intelligent medical big data service system according to claim 1, characterized in that: frequent item sets in the processed medical data are mined using the FP-growth algorithm among the association rule algorithms, and once the frequent item sets are obtained, the association rules are derived according to the minimum support and the minimum confidence.
3. The application method of the intelligent medical big data service system according to claim 1, characterized in that: the medicine enterprise cloud platform is used for acquiring partial data of a private cloud health data center of a user and medical data of a hospital cloud platform, and sending medicine information to the private cloud health data center of the user and the hospital cloud platform after the cloud platform analyzes the data.
CN201810386146.2A 2018-04-26 2018-04-26 Intelligent medical big data service system and application method Active CN108573758B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810386146.2A CN108573758B (en) 2018-04-26 2018-04-26 Intelligent medical big data service system and application method

Publications (2)

Publication Number Publication Date
CN108573758A CN108573758A (en) 2018-09-25
CN108573758B true CN108573758B (en) 2021-10-08

Family

ID=63575338

Country Status (1)

Country Link
CN (1) CN108573758B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110969534A (en) * 2018-09-30 2020-04-07 上海笛乐护斯健康科技有限公司 Intelligent health management and life safety insurance system and implementation method thereof
CN109411093B (en) * 2018-10-16 2022-03-18 国康中健(北京)健康科技有限公司 Intelligent medical big data analysis processing method based on cloud computing
CN109088782A (en) * 2018-11-01 2018-12-25 郑州云海信息技术有限公司 The log collecting method and device of distributed system
CN109599155A (en) * 2018-12-10 2019-04-09 上海新储集成电路有限公司 A kind of intelligent service system and method applied to medical data center
CN111312409B (en) * 2018-12-11 2023-11-10 康泰医学系统(秦皇岛)股份有限公司 Medical data sharing system and method
CN110164552A (en) * 2019-05-27 2019-08-23 苏州嘉展科技有限公司 A kind of system and method for extensive screening diabetic and tracking treatment effect
CN110336706B (en) * 2019-07-23 2022-09-13 中国工商银行股份有限公司 Network message transmission processing method and device
CN112309564A (en) * 2019-07-26 2021-02-02 深圳百诺明医说科技有限公司 Artificial intelligence diagnostic system and intelligent robot
CN111785372A (en) * 2020-05-14 2020-10-16 浙江知盛科技集团有限公司 Collaborative filtering disease prediction system based on association rule and electronic equipment thereof
CN117349030A (en) * 2023-12-04 2024-01-05 深圳本贸科技股份有限公司 Medical digital system, method and equipment based on cloud computing cluster

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105760705A (en) * 2016-05-20 2016-07-13 陕西科技大学 Medical diagnosis system based on big data
CN105808946A (en) * 2016-03-08 2016-07-27 贵州省邮电规划设计院有限公司 Remote mobile medical system based on cloud computing
CN106326623A (en) * 2015-07-06 2017-01-11 薛海强 Health information processing method and system
CN106951691A (en) * 2017-03-06 2017-07-14 宁波大学 Mobile telemedicine management method based on cloud platform
CN107633876A (en) * 2017-10-31 2018-01-26 郑宇� A kind of internet medical information processing system and method based on mobile platform

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant