CN109671507A - Method for mining associated indicators of obstetric specialty diseases based on electronic health records - Google Patents

Method for mining associated indicators of obstetric specialty diseases based on electronic health records

Info

Publication number
CN109671507A
Authority
CN
China
Prior art keywords
index
sample
disease
data
obstetrics
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811578884.3A
Other languages
Chinese (zh)
Inventor
李静
李光亚
张敬谊
姜峰
路平
卢鹏飞
丁偕
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
WANDA INFORMATION CO Ltd
Original Assignee
WANDA INFORMATION CO Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by WANDA INFORMATION CO Ltd
Priority to CN201811578884.3A
Publication of CN109671507A
Legal status: Pending

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H10/00 ICT specially adapted for the handling or processing of patient-related medical or healthcare data
    • G16H10/60 ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H70/00 ICT specially adapted for the handling or processing of medical references

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Public Health (AREA)
  • Medical Informatics (AREA)
  • Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • General Health & Medical Sciences (AREA)
  • Epidemiology (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Biomedical Technology (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • Pathology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The invention relates to a method for mining associated indicators of obstetric specialty diseases based on electronic health records. Its steps comprise building a basic information database, associating data disease by disease, preprocessing the data, sampling to obtain sample sets, constructing sub-feature filters, constructing a combined filter, and testing the associated indicators. The invention has the following advantages: first, associated indicators are mined from massive electronic health record data, so little resource input is needed; second, machine learning methods analyse high-dimensional indicators automatically, so a more comprehensive set of associated indicators can be mined quickly; third, a process-based associated-indicator mining system can be established for obstetric specialty diseases, which is easy to roll out across different medical institutions. The method is therefore a strong complement to traditional methods: on the one hand it can assist researchers in knowledge discovery and save time and manpower; on the other hand it can provide a basis for the early prevention and treatment of obstetric specialty diseases.

Description

Method for mining associated indicators of obstetric specialty diseases based on electronic health records
Technical field
The invention relates to a method for mining associated indicators of obstetric specialty diseases based on electronic health records, and belongs to the technical field of knowledge discovery methods for obstetric specialty diseases.
Background
In recent years, the incidence of obstetric diseases in China has gradually risen. Clinically, there are more and more parturients of advanced maternal age, and the incidence of biochemical pregnancy, spontaneous abortion, embryonic arrest, birth defects, gestational diabetes, and gestational hypertension shows a clear upward trend, which seriously affects the health of pregnant women and interferes with normal fetal development. Obstetric specialty diseases, especially gestational diabetes and gestational hypertension, show large individual differences in diagnosis, their causes remain unclear, and there is no effective cure; once they occur, they have many adverse effects on the pregnant woman and the fetus.
To carry out clinical research and disease prevention and treatment, the traditional approach at present is to collect data through experiments in order to mine early associated indicators of a disease, and then to analyse and investigate the values of those indicators further. However, this approach has several problems. First, collecting the data is difficult and slow, and it demands substantial experimental resources. Second, experimental design depends heavily on the experimenter's clinical experience, so design quality is uneven; a flawed design not only wastes resources but may also lead to wrong conclusions and cause greater harm. Third, because resources are limited, only a limited number of variables can usually be assessed in one experimental design, so few associated indicators can be mined.
In recent years, as medical and health data collection and electronic health record mechanisms have matured, medical institutions have stored massive amounts of obstetric disease information, which provides the basis for automatically mining associated indicators of obstetric diseases.
Summary of the invention
The purpose of the invention is to provide clinical obstetric researchers with an interpretable set of associated indicators, and to mine and analyse the associated indicators of obstetric specialty diseases automatically.
To achieve this purpose, the technical solution of the invention is a method for mining associated indicators of obstetric specialty diseases based on electronic health records, characterized by comprising the following steps:
Step 1: collect the electronic health record data of obstetric pregnant women and build a basic information database for obstetric specialty diseases;
Step 2: associate the data disease by disease, which comprises table-structure association and patient labeling. Table-structure association joins all tables of the obstetric specialty disease basic information database on the patient's unique identifier to build a disease information master table for each obstetric disease. Patient labeling uses the diagnostic information in the disease information master table to divide, for each disease, the data samples in the obstetric specialty disease basic information database into a diseased group and a non-diseased group, and marks each sample with a group class label;
Step 3: preprocess the sample data in the obstetric specialty disease basic information database, including feature extraction, data cleaning, indicator filtering, and data standardization;
Step 4: for the disease to be analysed, construct a diseased sample set and a non-diseased sample set by sampling;
Step 5: construct multiple sub-feature filters, namely a Fisher discriminant ratio feature filter, a Pearson correlation coefficient feature filter, a cosine similarity feature filter, and a distance correlation coefficient feature filter, and compute each of them for every indicator, where:
Let the Fisher discriminant ratio be FDR; then:
$$\mathrm{FDR} = \frac{(\mu_1 - \mu_2)^2}{\sigma_1^2 + \sigma_2^2}$$
where $\mu_1$ and $\mu_2$ are the sample means of an indicator in the diseased group and the non-diseased group respectively, and $\sigma_1$ and $\sigma_2$ are the sample standard deviations of that indicator in the diseased group and the non-diseased group respectively;
Let an indicator sample be $x = (x_1, x_2, \dots, x_n)$ and the sample labels of x be $y = (y_1, y_2, \dots, y_n)$; then the Pearson correlation coefficient r of the indicator sample x is:
$$r = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n}(y_i - \bar{y})^2}}$$
where $\bar{x}$ is the mean of the indicator sample x and $\bar{y}$ is the mean of the sample labels y;
Let an indicator sample be x and its sample labels be y; then the cosine similarity s between x and y is:
$$s = \frac{\sum_{i=1}^{n} x_i y_i}{\sqrt{\sum_{i=1}^{n} x_i^2}\,\sqrt{\sum_{i=1}^{n} y_i^2}}$$
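Let an indicator sample be x and its sample labels be y, and let the distance correlation coefficient between x and y be dcorr; then:
$$\mathrm{dcorr}(x, y) = \frac{\mathrm{dcov}(x, y)}{\sqrt{\mathrm{dcov}(x, x)\,\mathrm{dcov}(y, y)}}$$
where $\mathrm{dcov}^2(x, y)$, $\mathrm{dcov}^2(x, x)$ and $\mathrm{dcov}^2(y, y)$ are respectively:
$$\mathrm{dcov}^2(x, y) = \frac{1}{n^2}\sum_{i=1}^{n}\sum_{j=1}^{n} A_{ij}B_{ij},\quad \mathrm{dcov}^2(x, x) = \frac{1}{n^2}\sum_{i=1}^{n}\sum_{j=1}^{n} A_{ij}^2,\quad \mathrm{dcov}^2(y, y) = \frac{1}{n^2}\sum_{i=1}^{n}\sum_{j=1}^{n} B_{ij}^2$$
where $A_{ij}$ and $B_{ij}$ are respectively:
$$A_{ij} = a_{ij} - \bar{a}_{i\cdot} - \bar{a}_{\cdot j} + \bar{a}_{\cdot\cdot},\quad B_{ij} = b_{ij} - \bar{b}_{i\cdot} - \bar{b}_{\cdot j} + \bar{b}_{\cdot\cdot}$$
where $a_{ij}$ and $b_{ij}$ are respectively:
$$a_{ij} = |x_i - x_j|,\quad b_{ij} = |y_i - y_j|$$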
Step 6: take the values computed by the sub-feature filters in step 5 as the input of a combined feature filter, construct the combined filter using the principal component comprehensive evaluation method, compute the principal component score of each indicator, and retain the top p indicators when ranked by score;
Step 7: apply a t-test in turn to each of the indicators retained in step 6, and output the test results.
Preferably, in step 1, the electronic health record data include basic maternal information, physical examination data, admission records, discharge records, laboratory test and examination data, ultrasound examination data, medical record diagnosis data, surgery information, disease history, and hereditary and medication history data.
Preferably, in step 3, the sample data are text data, and feature extraction means extracting patient feature information from the text data with regular expressions and converting it into numeric and enumerated variables.
Preferably, in step 3, data cleaning comprises missing-value handling and outlier removal. Missing values are handled by mean imputation, imputation with the value "-1", or deletion of samples with missing values; outliers are removed by sample deletion or by the Pauta criterion (3σ rule).
Preferably, in step 3, indicator filtering comprises removal of irrelevant indicators and filtering of indicators with missing data. Irrelevant-indicator removal discards indicators unrelated to the disease; missing-indicator filtering deletes an indicator when its missing ratio exceeds a given ratio β.
Preferably, in step 3, data standardization transforms each indicator using estimates of its mean and variance.
Preferably, in step 6, the principal component comprehensive evaluation method comprises the following steps:
Step 601, compute the covariance matrix: compute the covariance matrix $\Sigma = (s_{ij})_{p \times p}$ of the sample data, where:
$$s_{ij} = \frac{1}{n-1}\sum_{k=1}^{n}(x_{ki} - \bar{x}_i)(x_{kj} - \bar{x}_j)$$
where i, j = 1, 2, …, p; n is the number of samples; $x_{ki}$ is the k-th sample datum of the i-th indicator; $\bar{x}_i$ is the mean of all sample data of the i-th indicator; $x_{kj}$ is the k-th sample datum of the j-th indicator; $\bar{x}_j$ is the mean of all sample data of the j-th indicator;
Step 602, determine the principal components: find all eigenvalues of the covariance matrix Σ, the i-th eigenvalue being denoted $\lambda_i$; take the m largest eigenvalues $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_m \ge 0$ and the orthonormal eigenvectors corresponding to them, the i-th orthonormal eigenvector being denoted $e_i$. The eigenvalues $\lambda_1, \lambda_2, \dots, \lambda_m$ are the variances of the first m principal components, and $e_i$ gives the coefficients of the i-th principal component with respect to the original variables. The variance contribution rate of the i-th principal component is $\alpha_i$:
$$\alpha_i = \frac{\lambda_i}{\sum_{k=1}^{p}\lambda_k}$$
Select the principal components: m principal components are obtained, where m is determined by the cumulative variance contribution rate G(m):
$$G(m) = \frac{\sum_{i=1}^{m}\lambda_i}{\sum_{k=1}^{p}\lambda_k}$$
When G(m) is greater than a preset threshold, the first m principal components are considered to reflect enough of the information in the original variables, and that m is the number of principal components extracted;
Compute the principal component scores: compute the score of each indicator on the m principal components. Let the score on the i-th principal component be $F_i$; then:
$$F_i = e_{1i}x_1 + e_{2i}x_2 + \dots + e_{pi}x_p,\quad i = 1, 2, \dots, m.$$
Preferably, in step 7, the test comprises the following steps:
Step 701, construct the test statistic: for a given indicator, let the diseased-group samples be $x_i$, i = 1, 2, …, M, and the non-diseased-group samples be $y_i$, i = 1, 2, …, N, and construct the test statistic q:
$$q = \frac{\bar{x} - \bar{y}}{S_w\sqrt{\frac{1}{M} + \frac{1}{N}}},\quad S_w^2 = \frac{(M-1)S_x^2 + (N-1)S_y^2}{M+N-2}$$
where $\bar{x}$ is the diseased-group sample mean of the indicator, $\bar{y}$ is the non-diseased-group sample mean of the indicator, and $S_x^2$ and $S_y^2$ are the corresponding sample variances;
Step 702, perform the test: set the significance level α, 0 < α < 0.1; compute the critical value $t_\alpha(M+N-2)$ of the t distribution at significance level α with M+N-2 degrees of freedom; compute the test statistic q of the indicator. If q lies outside the interval $[-t_\alpha(M+N-2),\ t_\alpha(M+N-2)]$, the test is passed; otherwise, the test is not passed;
Step 703, output the result: output the indicator's significance level α, degrees of freedom M+N-2, test statistic q, and whether the test is passed.
The invention has the following advantages: first, associated indicators are mined from massive electronic health record data, so little resource input is needed; second, machine learning methods analyse high-dimensional indicators automatically, so a more comprehensive set of associated indicators can be mined quickly; third, a process-based associated-indicator mining system can be established for obstetric specialty diseases, which is easy to roll out across different medical institutions.
The method provided by the invention is therefore a strong complement to traditional methods: on the one hand it can assist researchers in knowledge discovery and save time and manpower; on the other hand it can provide a basis for the early prevention and treatment of obstetric specialty diseases.
Brief description of the drawings
Fig. 1 is a flow chart of the method provided by the invention for mining associated indicators of obstetric specialty diseases based on electronic health records.
Detailed description of the embodiments
The invention is further described below with reference to specific embodiments. It should be understood that these embodiments only illustrate the invention and do not limit its scope. It should also be understood that, after reading the teachings of the invention, those skilled in the art may make various changes or modifications to it, and such equivalents likewise fall within the scope defined by the claims appended to this application.
With reference to Fig. 1, the invention provides a method for mining associated indicators of obstetric specialty diseases based on electronic health records, comprising the following steps:
Step 1: collect the electronic health record data of obstetric pregnant women and build a basic information database for obstetric specialty diseases. The electronic health record data include basic maternal information, physical examination data, admission records, discharge records, laboratory test and examination data, ultrasound examination data, medical record diagnosis data, surgery information, disease history, and hereditary and medication history data.
Step 2: associate the data disease by disease, which comprises table-structure association and patient labeling. Table-structure association joins all tables of the obstetric specialty disease basic information database on the patient's unique identifier to build a disease information master table for each obstetric disease. Patient labeling uses the diagnostic information in the disease information master table to divide, for each disease, the data samples in the obstetric specialty disease basic information database into a diseased group and a non-diseased group, marking the group class labels with the numbers 1 and 0 respectively.
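The table association and labeling of step 2 can be sketched as follows. This is only a minimal illustration, assuming each table is a list of row dictionaries keyed by a hypothetical `patient_id` field and that diagnoses are available as a set of disease names per patient; none of these names come from the patent.

```python
from collections import defaultdict

def build_master_table(tables, diagnoses, disease):
    """Join rows from several tables on patient_id and attach a 0/1 label.

    tables: dict of table name -> list of row dicts (hypothetical layout)
    diagnoses: dict of patient_id -> set of diagnosed disease names
    disease: the disease being analysed
    """
    merged = defaultdict(dict)
    for rows in tables.values():
        for row in rows:
            pid = row["patient_id"]
            # Copy every field except the join key into the patient's record.
            merged[pid].update({k: v for k, v in row.items() if k != "patient_id"})
    for pid, record in merged.items():
        # Label 1 = diseased group, label 0 = non-diseased group.
        record["label"] = 1 if disease in diagnoses.get(pid, set()) else 0
    return dict(merged)

tables = {
    "basic": [{"patient_id": "P1", "age": 31}, {"patient_id": "P2", "age": 27}],
    "exam": [{"patient_id": "P1", "sbp": 128}, {"patient_id": "P2", "sbp": 110}],
}
diagnoses = {"P1": {"gestational diabetes"}}
master = build_master_table(tables, diagnoses, "gestational diabetes")
```

In a real deployment the join would more likely be done in the database itself; the dictionary merge here only shows the logic.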
Step 3: preprocess the sample data in the obstetric specialty disease basic information database, including feature extraction, data cleaning, indicator filtering, and data standardization, where:
The sample data are text data, and feature extraction means extracting patient feature information from the text data with regular expressions and converting it into numeric and enumerated variables.
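A sketch of regular-expression feature extraction is given below. The record text, the patterns, and the enumeration codes are all hypothetical; actual patterns would have to be tuned to the wording of the records in use.

```python
import re

# Hypothetical patterns for a hypothetical free-text note format.
BP_PATTERN = re.compile(r"BP\s*(\d{2,3})\s*/\s*(\d{2,3})")
HEIGHT_PATTERN = re.compile(r"height\s*(\d+(?:\.\d+)?)\s*cm", re.IGNORECASE)
HISTORY_PATTERN = re.compile(r"family history:\s*(\w+)", re.IGNORECASE)
HISTORY_CODES = {"none": 0, "diabetes": 1, "hypertension": 2}  # enumerated variable

def extract_features(text):
    """Return numeric and enumerated features found in one text record."""
    features = {}
    bp = BP_PATTERN.search(text)
    if bp:
        features["systolic"] = int(bp.group(1))
        features["diastolic"] = int(bp.group(2))
    height = HEIGHT_PATTERN.search(text)
    if height:
        features["height_cm"] = float(height.group(1))
    history = HISTORY_PATTERN.search(text)
    if history:
        # Unknown categories are coded as -1.
        features["family_history"] = HISTORY_CODES.get(history.group(1).lower(), -1)
    return features

note = "Height 162.5 cm, BP 128/82 mmHg. Family history: diabetes."
features = extract_features(note)
```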
Data cleaning comprises missing-value handling and outlier removal. Missing values are handled by mean imputation, imputation with the value "-1", or deletion of samples with missing values; outliers are removed by sample deletion or by the Pauta criterion (3σ rule).
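Two of the cleaning options named above, mean imputation and the Pauta criterion, might look like this in outline. The sketch assumes missing values are represented as `None` and uses the sample standard deviation; both are assumptions, not details given in the text.

```python
from statistics import mean, stdev

def fill_missing_with_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    m = mean(observed)
    return [m if v is None else v for v in values]

def pauta_filter(values, k=3.0):
    """Keep values within k sample standard deviations of the mean (3-sigma rule)."""
    m, s = mean(values), stdev(values)
    return [v for v in values if abs(v - m) <= k * s]

weights = [60.0] * 10 + [62.0] * 10 + [500.0]  # one implausible entry
cleaned = pauta_filter(weights)
filled = fill_missing_with_mean([1.0, None, 3.0])
```

Note that the 3σ rule can only flag a value when the sample is large enough (roughly 11 or more points), since a single point cannot lie more than (n-1)/√n sample standard deviations from the mean.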
Indicator filtering comprises removal of irrelevant indicators and filtering of indicators with missing data. Irrelevant-indicator removal discards indicators unrelated to the disease; missing-indicator filtering deletes an indicator when its missing ratio exceeds a given ratio β.
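The missing-ratio filter can be illustrated with a few lines of Python. The sample layout (a list of dictionaries with `None` for missing values) and the function name are assumptions made for the example.

```python
def drop_sparse_indicators(samples, beta):
    """Remove indicators whose missing ratio exceeds beta.

    samples: list of dicts mapping indicator name -> value (None = missing)
    beta: maximum tolerated missing ratio, between 0 and 1
    """
    names = {name for sample in samples for name in sample}
    kept = []
    for name in sorted(names):
        missing = sum(1 for sample in samples if sample.get(name) is None)
        if missing / len(samples) <= beta:
            kept.append(name)
    return [{name: sample.get(name) for name in kept} for sample in samples]

samples = [
    {"age": 30, "tg": None},
    {"age": 28, "tg": None},
    {"age": 35, "tg": 1.7},
    {"age": None, "tg": None},
]
filtered = drop_sparse_indicators(samples, beta=0.5)
```

Here `age` is missing in 25% of samples and is kept, while `tg` is missing in 75% and is dropped.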
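Data standardization transforms each indicator using estimates of its mean and variance. Specifically, the k-th indicator has N sample data, and the standardized value $x^*_{ik}$ of the i-th sample datum $x_{ik}$ of the k-th indicator is computed as:
$$x^*_{ik} = \frac{x_{ik} - \bar{x}_k}{s_k},\quad \bar{x}_k = \frac{1}{N}\sum_{i=1}^{N} x_{ik},\quad s_k = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}(x_{ik} - \bar{x}_k)^2}$$
where $x_{ik}$ is the i-th sample datum of the k-th indicator, and $\bar{x}_k$ and $s_k$ are the estimated mean and standard deviation of the k-th indicator;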
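As a small worked example of this kind of standardization, the sketch below computes z-scores for one indicator. It assumes the sample standard deviation with the N-1 denominator, which is one common choice of estimate and not something the text specifies.

```python
from statistics import mean, stdev

def standardize(values):
    """Return (v - mean) / sample standard deviation for each value."""
    m = mean(values)
    s = stdev(values)  # uses the N-1 denominator
    return [(v - m) / s for v in values]

z = standardize([2.0, 4.0, 6.0])  # mean 4, standard deviation 2
```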
In this embodiment, step 3 yields 24 indicators, including basic information such as age, height, and pre-pregnancy BMI, and examination information such as systolic and diastolic blood pressure.
Step 4: for the disease to be analysed, construct a diseased sample set and a non-diseased sample set by sampling. Systematic sampling is preferred for drawing the sample sets from the overall sample population. Besides systematic sampling, simple random sampling, stratified sampling, or cluster sampling may be chosen according to the disease under study to construct the sample sets.
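Systematic sampling, the preferred option in step 4, picks items at a fixed interval after a random start. The following is a minimal sketch; the function name and the `seed` parameter are illustrative assumptions.

```python
import random

def systematic_sample(population, n, seed=None):
    """Pick n items from population at a fixed interval after a random start."""
    if not 0 < n <= len(population):
        raise ValueError("n must be between 1 and the population size")
    step = len(population) / n
    start = random.Random(seed).random() * step  # random start in [0, step)
    last = len(population) - 1
    return [population[min(int(start + i * step), last)] for i in range(n)]

patient_ids = list(range(100))
sample = systematic_sample(patient_ids, 10, seed=0)
```

The same function would be applied separately to the diseased and non-diseased groups to obtain the two sample sets.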
Step 5: construct multiple sub-feature filters, namely a Fisher discriminant ratio feature filter, a Pearson correlation coefficient feature filter, a cosine similarity feature filter, and a distance correlation coefficient feature filter, and compute each of them for every indicator, where:
Let the Fisher discriminant ratio be FDR; then:
$$\mathrm{FDR} = \frac{(\mu_1 - \mu_2)^2}{\sigma_1^2 + \sigma_2^2}$$
where $\mu_1$ and $\mu_2$ are the sample means of an indicator in the diseased group and the non-diseased group respectively, and $\sigma_1$ and $\sigma_2$ are the sample standard deviations of that indicator in the diseased group and the non-diseased group respectively;
Let an indicator sample be $x = (x_1, x_2, \dots, x_n)$ and the sample labels of x be $y = (y_1, y_2, \dots, y_n)$; then the Pearson correlation coefficient r of the indicator sample x is:
$$r = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n}(y_i - \bar{y})^2}}$$
where $\bar{x}$ is the mean of the indicator sample x and $\bar{y}$ is the mean of the sample labels y;
Let an indicator sample be x and its sample labels be y; then the cosine similarity s between x and y is:
$$s = \frac{\sum_{i=1}^{n} x_i y_i}{\sqrt{\sum_{i=1}^{n} x_i^2}\,\sqrt{\sum_{i=1}^{n} y_i^2}}$$
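Let an indicator sample be x and its sample labels be y, and let the distance correlation coefficient between x and y be dcorr; then:
$$\mathrm{dcorr}(x, y) = \frac{\mathrm{dcov}(x, y)}{\sqrt{\mathrm{dcov}(x, x)\,\mathrm{dcov}(y, y)}}$$
where $\mathrm{dcov}^2(x, y)$, $\mathrm{dcov}^2(x, x)$ and $\mathrm{dcov}^2(y, y)$ are respectively:
$$\mathrm{dcov}^2(x, y) = \frac{1}{n^2}\sum_{i=1}^{n}\sum_{j=1}^{n} A_{ij}B_{ij},\quad \mathrm{dcov}^2(x, x) = \frac{1}{n^2}\sum_{i=1}^{n}\sum_{j=1}^{n} A_{ij}^2,\quad \mathrm{dcov}^2(y, y) = \frac{1}{n^2}\sum_{i=1}^{n}\sum_{j=1}^{n} B_{ij}^2$$
where $A_{ij}$ and $B_{ij}$ are respectively:
$$A_{ij} = a_{ij} - \bar{a}_{i\cdot} - \bar{a}_{\cdot j} + \bar{a}_{\cdot\cdot},\quad B_{ij} = b_{ij} - \bar{b}_{i\cdot} - \bar{b}_{\cdot j} + \bar{b}_{\cdot\cdot}$$
where $a_{ij}$ and $b_{ij}$ are respectively:
$$a_{ij} = |x_i - x_j|,\quad b_{ij} = |y_i - y_j|$$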
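The four sub-feature scores of step 5 can be computed for a single indicator as sketched below. The code follows the textbook definitions of the Fisher discriminant ratio, the Pearson coefficient, cosine similarity, and the sample distance correlation; whether the patent's formulas match these in every detail (for example the variance denominator) is an assumption.

```python
import math
from statistics import mean, variance

def fisher_ratio(x, y):
    """(mu1 - mu2)^2 / (var1 + var2), groups split by label 1 / 0."""
    pos = [v for v, lab in zip(x, y) if lab == 1]
    neg = [v for v, lab in zip(x, y) if lab == 0]
    return (mean(pos) - mean(neg)) ** 2 / (variance(pos) + variance(neg))

def pearson(x, y):
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den

def cosine(x, y):
    num = sum(a * b for a, b in zip(x, y))
    return num / math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))

def _centered_distances(v):
    """Pairwise |v_i - v_j| with row, column, and grand means removed."""
    n = len(v)
    d = [[abs(a - b) for b in v] for a in v]
    row = [sum(r) / n for r in d]  # equals the column means (d is symmetric)
    grand = sum(row) / n
    return [[d[i][j] - row[i] - row[j] + grand for j in range(n)] for i in range(n)]

def _dcov2(a, b):
    n = len(a)
    return sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n ** 2

def distance_correlation(x, y):
    a, b = _centered_distances(x), _centered_distances(y)
    denom = math.sqrt(_dcov2(a, a) * _dcov2(b, b))
    if denom == 0:
        return 0.0
    return math.sqrt(max(_dcov2(a, b), 0.0) / denom)

x = [1.0, 2.0, 3.0, 4.0]   # one indicator
y = [0, 0, 1, 1]           # group labels
scores = (fisher_ratio(x, y), pearson(x, y), cosine(x, y), distance_correlation(x, y))
```

Running the four functions over every indicator gives one row of four scores per indicator, which is the input expected by the combined filter in step 6.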
In this embodiment, the results are shown in Table 1 below:
Table 1. Calculation results of the four sub-feature filters
Step 6: take the values computed by the sub-feature filters in step 5 as the input of a combined feature filter, construct the combined filter using the principal component comprehensive evaluation method, compute the principal component score of each indicator, and retain the top p indicators when ranked by score.
The principal component comprehensive evaluation method comprises the following steps:
Step 601, compute the covariance matrix: compute the covariance matrix $\Sigma = (s_{ij})_{p \times p}$ of the sample data, where:
$$s_{ij} = \frac{1}{n-1}\sum_{k=1}^{n}(x_{ki} - \bar{x}_i)(x_{kj} - \bar{x}_j)$$
where i, j = 1, 2, …, p; n is the number of samples; $x_{ki}$ is the k-th sample datum of the i-th indicator; $\bar{x}_i$ is the mean of all sample data of the i-th indicator; $x_{kj}$ is the k-th sample datum of the j-th indicator; $\bar{x}_j$ is the mean of all sample data of the j-th indicator;
Step 602, determine the principal components: find all eigenvalues of the covariance matrix Σ, the i-th eigenvalue being denoted $\lambda_i$; take the m largest eigenvalues $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_m \ge 0$ and the orthonormal eigenvectors corresponding to them, the i-th orthonormal eigenvector being denoted $e_i$. The eigenvalues $\lambda_1, \lambda_2, \dots, \lambda_m$ are the variances of the first m principal components, and $e_i$ gives the coefficients of the i-th principal component with respect to the original variables. The variance contribution rate of the i-th principal component is $\alpha_i$:
$$\alpha_i = \frac{\lambda_i}{\sum_{k=1}^{p}\lambda_k}$$
Select the principal components: m principal components are obtained, where m is determined by the cumulative variance contribution rate G(m):
$$G(m) = \frac{\sum_{i=1}^{m}\lambda_i}{\sum_{k=1}^{p}\lambda_k}$$
When G(m) is greater than a preset threshold, the first m principal components are considered to reflect enough of the information in the original variables, and that m is the number of principal components extracted;
Compute the principal component scores: compute the score of each indicator on the m principal components. Let the score on the i-th principal component be $F_i$; then:
$$F_i = e_{1i}x_1 + e_{2i}x_2 + \dots + e_{pi}x_p,\quad i = 1, 2, \dots, m.$$
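Two pieces of the procedure above lend themselves to a short sketch: the covariance matrix of step 601 and the choice of m from the cumulative variance contribution rate. The eigen-decomposition itself is left out (a numerical library would normally provide it), and the 0.85 threshold is purely an illustrative assumption, since the text only speaks of a preset threshold.

```python
def covariance_matrix(rows):
    """rows[k][i] is the k-th sample of the i-th variable; uses 1/(n-1)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[i] for r in rows) / n for i in range(p)]
    return [
        [sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
         for j in range(p)]
        for i in range(p)
    ]

def choose_component_count(eigenvalues, threshold=0.85):
    """Smallest m whose cumulative contribution rate G(m) exceeds threshold."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    cumulative = 0.0
    for m, lam in enumerate(ordered, start=1):
        cumulative += lam
        if cumulative / total > threshold:
            return m, cumulative / total
    return len(ordered), 1.0

cov = covariance_matrix([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])
m, g = choose_component_count([2.5, 1.0, 0.4, 0.1])  # G(1)=0.625, G(2)=0.875
```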
In this embodiment, the combined filter is constructed with the principal component comprehensive evaluation method, and the computed principal component scores of the indicators are shown in Table 2. The 12 indicators with the highest principal component scores are selected for the test in the next step: age, TG, WBC, pre-pregnancy BMI, hsCRP, pre-pregnancy weight, BMI category, AST, ALT, number of pregnancies, systolic blood pressure, and diastolic blood pressure.
Table 2. Principal component scores of the indicators
Step 7: apply a t-test in turn to each of the indicators retained in step 6, and output the test results. The test comprises the following steps:
Step 701, construct the test statistic: for a given indicator, let the diseased-group samples be $x_i$, i = 1, 2, …, M, and the non-diseased-group samples be $y_i$, i = 1, 2, …, N, and construct the test statistic q:
$$q = \frac{\bar{x} - \bar{y}}{S_w\sqrt{\frac{1}{M} + \frac{1}{N}}},\quad S_w^2 = \frac{(M-1)S_x^2 + (N-1)S_y^2}{M+N-2}$$
where $\bar{x}$ is the diseased-group sample mean of the indicator, $\bar{y}$ is the non-diseased-group sample mean of the indicator, and $S_x^2$ and $S_y^2$ are the corresponding sample variances;
Step 702, perform the test: set the significance level α, 0 < α < 0.1; compute the critical value $t_\alpha(M+N-2)$ of the t distribution at significance level α with M+N-2 degrees of freedom; compute the test statistic q of the indicator. If q lies outside the interval $[-t_\alpha(M+N-2),\ t_\alpha(M+N-2)]$, the test is passed; otherwise, the test is not passed;
Step 703, output the result: output the indicator's significance level α, degrees of freedom M+N-2, test statistic q, and whether the test is passed.
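The test of step 7 can be sketched as a pooled-variance two-sample t-test, which is the standard test with M+N-2 degrees of freedom; that the patent uses exactly this statistic is an assumption. The critical value is passed in as a parameter because the Python standard library has no t-distribution quantile function; the value 2.306 below is the two-sided critical value for α = 0.05 and 8 degrees of freedom from a t table.

```python
import math
from statistics import mean, variance

def pooled_t_statistic(x, y):
    """Two-sample t statistic with pooled variance (M + N - 2 degrees of freedom)."""
    m, n = len(x), len(y)
    pooled_var = ((m - 1) * variance(x) + (n - 1) * variance(y)) / (m + n - 2)
    return (mean(x) - mean(y)) / math.sqrt(pooled_var * (1 / m + 1 / n))

def passes_test(q, critical_value):
    """The indicator passes when q falls outside [-critical, +critical]."""
    return abs(q) > critical_value

diseased = [5.1, 5.3, 5.0, 5.4, 5.2]       # hypothetical indicator values
non_diseased = [4.6, 4.8, 4.7, 4.5, 4.9]
q = pooled_t_statistic(diseased, non_diseased)
result = {
    "alpha": 0.05,
    "degrees_of_freedom": len(diseased) + len(non_diseased) - 2,
    "q": q,
    "passed": passes_test(q, critical_value=2.306),
}
```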
In this embodiment, the significance level is set to 0.05, and the t-test is applied to the 12 indicators determined in step 6; the results are shown in Table 3. The diastolic blood pressure indicator does not pass the test, and the other indicators pass. Finally, the test results are output.
Table 3. Test results
Through the above steps, the associated indicators obtained for gestational diabetes are age, TG, WBC, pre-pregnancy BMI, hsCRP, pre-pregnancy weight, BMI category, AST, ALT, number of pregnancies, and systolic blood pressure. This result is fairly consistent with many published findings: multiple studies in China and abroad show that age, pre-pregnancy BMI, and pre-pregnancy weight have an important influence on gestational diabetes.
The invention provides a method for mining associated indicators of obstetric specialty diseases based on electronic health records. The method has the following advantages: first, it mines the associated indicators of obstetric specialty diseases automatically from the massive data held by medical institutions; second, it uses machine learning methods to analyse high-dimensional indicators quantitatively, so a more comprehensive set of associated indicators can be mined quickly and accurately; third, it allows a process-based associated-indicator mining system to be established for obstetric specialty diseases, which is easy to roll out across different medical institutions.

Claims (8)

1. A method for mining associated indicators of obstetric specialty diseases based on electronic health records, characterized by comprising the following steps:
Step 1: collect the electronic health record data of obstetric pregnant women and build a basic information database for obstetric specialty diseases;
Step 2: associate the data disease by disease, which comprises table-structure association and patient labeling. Table-structure association joins all tables of the obstetric specialty disease basic information database on the patient's unique identifier to build a disease information master table for each obstetric disease. Patient labeling uses the diagnostic information in the disease information master table to divide, for each disease, the data samples in the obstetric specialty disease basic information database into a diseased group and a non-diseased group, and marks each sample with a group class label;
Step 3: preprocess the sample data in the obstetric specialty disease basic information database, including feature extraction, data cleaning, indicator filtering, and data standardization;
Step 4: for the disease to be analysed, construct a diseased sample set and a non-diseased sample set by sampling;
Step 5: construct multiple sub-feature filters, namely a Fisher discriminant ratio feature filter, a Pearson correlation coefficient feature filter, a cosine similarity feature filter, and a distance correlation coefficient feature filter, and compute each of them for every indicator, where:
Let the Fisher discriminant ratio be FDR; then:
$$\mathrm{FDR} = \frac{(\mu_1 - \mu_2)^2}{\sigma_1^2 + \sigma_2^2}$$
where $\mu_1$ and $\mu_2$ are the sample means of an indicator in the diseased group and the non-diseased group respectively, and $\sigma_1$ and $\sigma_2$ are the sample standard deviations of that indicator in the diseased group and the non-diseased group respectively;
Let an indicator sample be $x = (x_1, x_2, \dots, x_n)$ and the sample labels of x be $y = (y_1, y_2, \dots, y_n)$; then the Pearson correlation coefficient r of the indicator sample x is:
$$r = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n}(y_i - \bar{y})^2}}$$
where $\bar{x}$ is the mean of the indicator sample x and $\bar{y}$ is the mean of the sample labels y;
Let an indicator sample be x and its sample labels be y; then the cosine similarity s between x and y is:
$$s = \frac{\sum_{i=1}^{n} x_i y_i}{\sqrt{\sum_{i=1}^{n} x_i^2}\,\sqrt{\sum_{i=1}^{n} y_i^2}}$$
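Let an indicator sample be x and its sample labels be y, and let the distance correlation coefficient between x and y be dcorr; then:
$$\mathrm{dcorr}(x, y) = \frac{\mathrm{dcov}(x, y)}{\sqrt{\mathrm{dcov}(x, x)\,\mathrm{dcov}(y, y)}}$$
where $\mathrm{dcov}^2(x, y)$, $\mathrm{dcov}^2(x, x)$ and $\mathrm{dcov}^2(y, y)$ are respectively:
$$\mathrm{dcov}^2(x, y) = \frac{1}{n^2}\sum_{i=1}^{n}\sum_{j=1}^{n} A_{ij}B_{ij},\quad \mathrm{dcov}^2(x, x) = \frac{1}{n^2}\sum_{i=1}^{n}\sum_{j=1}^{n} A_{ij}^2,\quad \mathrm{dcov}^2(y, y) = \frac{1}{n^2}\sum_{i=1}^{n}\sum_{j=1}^{n} B_{ij}^2$$
where $A_{ij}$ and $B_{ij}$ are respectively:
$$A_{ij} = a_{ij} - \bar{a}_{i\cdot} - \bar{a}_{\cdot j} + \bar{a}_{\cdot\cdot},\quad B_{ij} = b_{ij} - \bar{b}_{i\cdot} - \bar{b}_{\cdot j} + \bar{b}_{\cdot\cdot}$$
where $a_{ij}$ and $b_{ij}$ are respectively:
$$a_{ij} = |x_i - x_j|,\quad b_{ij} = |y_i - y_j|$$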
Step 6: take the values computed by the sub-feature filters in step 5 as the input of a combined feature filter, construct the combined filter using the principal component comprehensive evaluation method, compute the principal component score of each indicator, and retain the top p indicators when ranked by score;
Step 7: apply a t-test in turn to each of the indicators retained in step 6, and output the test results.
2. a kind of obstetrics' disease that calls for specialized treatment coupling index method for digging based on Electronic Health Record as described in claim 1, feature It is, in the step 1, the electronic health record data includes pregnant woman's basic information data, physical examination data, is admitted to hospital and records number According to, discharge record data, examines and check data, ultrasound examination data, medical record diagnostic data, operation information data, history of disease and something lost Pass history data.
3. a kind of obstetrics' disease that calls for specialized treatment coupling index method for digging based on Electronic Health Record as described in claim 1, feature It is, in the step 3, the sample data is text data, and the feature extraction, which refers to, extracts text by regular expression Patient characteristic information in notebook data, and it is converted into numeric type and enumeration type variable.
4. The method for mining associated indexes of obstetric specialty diseases based on electronic health records according to claim 1, characterized in that, in step 3, data cleaning comprises missing value handling and outlier removal, wherein missing values are handled by mean filling, filling with the value "-1", or deleting the samples with missing values; and outliers are removed by sample deletion or by rejection under the Pauta criterion (3σ rule).
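A minimal sketch of the cleaning options listed in claim 4 follows. It assumes missing values are represented as `None` and that the Pauta criterion is applied with the sample standard deviation (n − 1 denominator); both are assumptions of this example, as are the function names.

```python
import math


def fill_missing(values, mode="mean"):
    """Fill None entries with the mean of the observed values or with -1."""
    observed = [v for v in values if v is not None]
    if mode == "mean":
        fill = sum(observed) / len(observed)
    elif mode == "minus_one":
        fill = -1
    else:
        raise ValueError("mode must be 'mean' or 'minus_one'")
    return [fill if v is None else v for v in values]


def drop_missing(samples):
    """Delete every sample (row) that contains a missing value."""
    return [row for row in samples if all(v is not None for v in row)]


def pauta_filter(values):
    """Keep values within 3 standard deviations of the mean (Pauta criterion)."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [v for v in values if abs(v - mean) <= 3 * std]


print(fill_missing([1.0, None, 3.0]))
print(drop_missing([[1, 2], [None, 3]]))
print(len(pauta_filter([10.0] * 19 + [100.0])))
```

Note that the 3σ rule can only reject a value when the sample is reasonably large; with very few samples no point can lie more than three standard deviations from the mean.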
5. The method for mining associated indexes of obstetric specialty diseases based on electronic health records according to claim 1, characterized in that, in step 3, index filtering comprises irrelevant index removal and missing index filtering, wherein irrelevant index removal means removing the indexes unrelated to the disease, and missing index filtering means deleting an index when its missing ratio is higher than a given ratio β.
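The two filters of claim 5 can be sketched as a single pass over the index table. The table layout (a dictionary from index name to a list of values, with `None` for missing), the index names and the value of β used here are assumptions for illustration only.

```python
def filter_indexes(table, irrelevant, beta):
    """Drop irrelevant indexes and indexes whose missing ratio exceeds beta.

    table: dict mapping index name -> list of values (None marks a missing value).
    irrelevant: set of index names judged unrelated to the disease.
    beta: maximum tolerated missing ratio.
    """
    kept = {}
    for name, values in table.items():
        if name in irrelevant:
            continue  # irrelevant index removal
        missing_ratio = sum(v is None for v in values) / len(values)
        if missing_ratio > beta:
            continue  # missing index filtering
        kept[name] = values
    return kept


# Hypothetical index table.
table = {
    "bed_number": [1, 2, 3, 4],
    "hemoglobin": [110, None, 120, 115],
    "rare_test": [None, None, None, 5],
}
print(list(filter_indexes(table, {"bed_number"}, beta=0.5)))
```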
6. The method for mining associated indexes of obstetric specialty diseases based on electronic health records according to claim 1, characterized in that, in step 3, data standardization is a method of transforming an index using the estimated values of its mean and variance.
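One common reading of the standardization in claim 6 is the z-score transform, sketched below. The use of the sample standard deviation (n − 1 denominator) as the variance estimate is an assumption of this example; the claim does not say which estimator is used.

```python
import statistics


def standardize(values):
    """Transform an index to zero mean and unit variance (z-score)."""
    mean = statistics.mean(values)
    std = statistics.stdev(values)  # sample estimate; an assumption of this sketch
    return [(v - mean) / std for v in values]


print(standardize([2.0, 4.0, 6.0]))
```

After this transform, indexes measured in different units are on a comparable scale.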
7. The method for mining associated indexes of obstetric specialty diseases based on electronic health records according to claim 1, characterized in that, in step 6, the principal component comprehensive evaluation method is computed by the following steps:
Step 601, compute the covariance matrix: compute the covariance matrix Σ of the sample data, $\Sigma = (s_{ij})_{p\times p}$, where:

$$s_{ij} = \frac{1}{n-1}\sum_{k=1}^{n}(x_{ki}-\bar{x}_i)(x_{kj}-\bar{x}_j)$$

In the formula: i, j = 1, 2, …, p; n is the number of samples; $x_{ki}$ denotes the k-th sample datum of the i-th index; $\bar{x}_i$ denotes the mean of all sample data of the i-th index; $x_{kj}$ denotes the k-th sample datum of the j-th index; $\bar{x}_j$ denotes the mean of all sample data of the j-th index;
Step 602, determine the principal components: find all eigenvalues of the covariance matrix Σ, the i-th eigenvalue being denoted $\lambda_i$; take the m largest eigenvalues $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_m \ge 0$ and the orthonormal eigenvectors corresponding to them, the i-th orthonormal eigenvector being denoted $e_i$; the eigenvalues $\lambda_1, \lambda_2, \dots, \lambda_m$ are the variances of the first m principal components respectively, and $e_i$ holds the coefficients of the i-th principal component with respect to the original variables; the variance contribution rate of the i-th principal component is $\alpha_i$, where:

$$\alpha_i = \frac{\lambda_i}{\sum_{k=1}^{p}\lambda_k}$$
Select the principal components: m principal components are obtained, where m is determined by the cumulative variance contribution rate G(m):

$$G(m) = \frac{\sum_{i=1}^{m}\lambda_i}{\sum_{k=1}^{p}\lambda_k}$$

When the cumulative variance contribution rate G(m) exceeds a preset threshold, the first m principal components are considered sufficient to reflect the information of the original variables, and the corresponding m is the number of principal components extracted;
Compute the principal component scores: compute the score of each index on the m principal components; letting $F_i$ denote the score on the i-th of the m principal components, then:

$$F_i = e_{1i}x_1 + e_{2i}x_2 + \dots + e_{pi}x_p,\qquad i = 1, 2, \dots, m.$$
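The steps of claim 7 (covariance matrix, eigen-decomposition, choice of m from the cumulative contribution rate, and score computation) can be sketched in plain Python as below. This is an illustration under stated assumptions rather than the patent's implementation: the covariance uses an n − 1 denominator, the eigenpairs are obtained with power iteration plus deflation purely to keep the example dependency-free (a numerical library would normally be used), and all function names are invented here.

```python
import math


def covariance_matrix(rows):
    """Step 601: sample covariance matrix (rows = samples, columns = variables)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[i] for r in rows) / n for i in range(p)]
    return [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
             for j in range(p)] for i in range(p)]


def eigen_decompose(sigma, iters=1000):
    """Step 602: eigenvalues (descending) and unit eigenvectors of a symmetric
    positive semi-definite matrix, by power iteration with deflation."""
    p = len(sigma)
    a = [row[:] for row in sigma]
    values, vectors = [], []
    for _ in range(p):
        v = [1.0 / (i + 1) for i in range(p)]  # arbitrary non-zero start vector
        norm = math.sqrt(sum(c * c for c in v))
        v = [c / norm for c in v]
        for _ in range(iters):
            w = [sum(a[i][j] * v[j] for j in range(p)) for i in range(p)]
            norm = math.sqrt(sum(c * c for c in w))
            if norm < 1e-12:  # remaining eigenvalues are numerically zero
                break
            v = [c / norm for c in w]
        lam = sum(v[i] * a[i][j] * v[j] for i in range(p) for j in range(p))
        values.append(lam)
        vectors.append(v)
        # Deflation: subtract the component that was just found.
        a = [[a[i][j] - lam * v[i] * v[j] for j in range(p)] for i in range(p)]
    return values, vectors


def select_m(values, threshold):
    """Smallest m whose cumulative variance contribution rate G(m) exceeds threshold."""
    total = sum(values)
    cumulative = 0.0
    for m, lam in enumerate(values, start=1):
        cumulative += lam
        if cumulative / total > threshold:
            return m
    return len(values)


def scores(x, vectors, m):
    """F_i = e_1i*x_1 + ... + e_pi*x_p for i = 1..m."""
    return [sum(e * xi for e, xi in zip(vectors[i], x)) for i in range(m)]


sigma = covariance_matrix([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])
values, vectors = eigen_decompose(sigma)
m = select_m(values, threshold=0.85)
print(m, scores([1.0, 1.0], vectors, m))
```

The threshold of 0.85 is only an example value; the claim leaves the preset threshold unspecified.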
8. The method for mining associated indexes of obstetric specialty diseases based on electronic health records according to claim 1, characterized in that, in step 7, the test comprises the following steps:
Step 701, construct the test statistic: let the diseased population sample of an index be $x_i$, i = 1, 2, …, M, and the non-diseased population sample be $y_i$, i = 1, 2, …, N; construct the test statistic q:

$$q = \frac{\bar{x}-\bar{y}}{\sqrt{\dfrac{\sum_{i=1}^{M}(x_i-\bar{x})^{2}+\sum_{i=1}^{N}(y_i-\bar{y})^{2}}{M+N-2}}\;\sqrt{\dfrac{1}{M}+\dfrac{1}{N}}}$$

where $\bar{x}$ is the mean of the diseased population sample of the index and $\bar{y}$ is the mean of the non-diseased population sample of the index;
Step 702, test computation: set the significance level α, 0 < α < 0.1; compute the critical value $t_\alpha(M+N-2)$ of the t distribution at significance level α with M+N-2 degrees of freedom; compute the test statistic q of the index; if q lies outside the interval $[-t_\alpha(M+N-2),\ t_\alpha(M+N-2)]$, the test is passed; otherwise, the test is not passed;
Step 703, output the result: output the significance level α of the index, the degrees of freedom M+N-2, the test statistic q, and whether the test is passed.
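The test procedure of claim 8 can be sketched as follows, assuming the statistic is the standard pooled-variance two-sample t statistic, which matches the M+N-2 degrees of freedom stated in the claim. The critical value is passed in as a parameter (looked up from a t table or a statistics library) rather than computed; the function names and the toy samples are assumptions of this example.

```python
import math


def pooled_t_statistic(x, y):
    """Step 701: two-sample t statistic with pooled variance."""
    m, n = len(x), len(y)
    mean_x, mean_y = sum(x) / m, sum(y) / n
    ss = sum((v - mean_x) ** 2 for v in x) + sum((v - mean_y) ** 2 for v in y)
    pooled_var = ss / (m + n - 2)
    return (mean_x - mean_y) / math.sqrt(pooled_var * (1 / m + 1 / n))


def t_test(x, y, alpha, critical_value):
    """Steps 702 and 703: compare q with the critical value and report the result.

    critical_value is t_alpha(M+N-2), supplied by the caller.
    """
    q = pooled_t_statistic(x, y)
    return {
        "alpha": alpha,
        "df": len(x) + len(y) - 2,
        "q": q,
        # Passed when q falls outside [-critical_value, critical_value].
        "passed": abs(q) > critical_value,
    }


# Toy samples; 2.776 is the two-sided 5% critical value of t with 4 degrees of freedom.
result = t_test([5.0, 6.0, 7.0], [1.0, 2.0, 3.0], alpha=0.05, critical_value=2.776)
print(result)
```

Whether $t_\alpha$ in the claim denotes a one-sided or two-sided critical value is not stated, so the caller chooses which table value to supply.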
CN201811578884.3A 2018-12-24 2018-12-24 A kind of obstetrics' disease that calls for specialized treatment coupling index method for digging based on Electronic Health Record Pending CN109671507A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811578884.3A CN109671507A (en) 2018-12-24 2018-12-24 A kind of obstetrics' disease that calls for specialized treatment coupling index method for digging based on Electronic Health Record

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811578884.3A CN109671507A (en) 2018-12-24 2018-12-24 A kind of obstetrics' disease that calls for specialized treatment coupling index method for digging based on Electronic Health Record

Publications (1)

Publication Number Publication Date
CN109671507A true CN109671507A (en) 2019-04-23

Family

ID=66145898

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811578884.3A Pending CN109671507A (en) 2018-12-24 2018-12-24 A kind of obstetrics' disease that calls for specialized treatment coupling index method for digging based on Electronic Health Record

Country Status (1)

Country Link
CN (1) CN109671507A (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110164559A (en) * 2019-04-28 2019-08-23 万达信息股份有限公司 A kind of lunger's early warning system based on electronic health record data
CN110631640A (en) * 2019-10-29 2019-12-31 云南师范大学 Intelligent monitoring system and method for storage tobacco leaf mildew state based on Internet of things
CN110674373A (en) * 2019-09-17 2020-01-10 上海森亿医疗科技有限公司 Big data processing method, device, equipment and storage medium based on sensitive data
CN111243753A (en) * 2020-02-27 2020-06-05 西安交通大学 Medical data-oriented multi-factor correlation interactive analysis method
CN111710411A (en) * 2020-05-29 2020-09-25 中润普达(十堰)大数据中心有限公司 Intelligent disease presumption system based on blood fat inspection indexes
CN111863240A (en) * 2020-07-08 2020-10-30 中润普达(十堰)大数据中心有限公司 Disease cognitive system based on abnormal change of human body fluid
CN112765144A (en) * 2021-01-22 2021-05-07 武汉大学 Method for checking and correcting conflict items after merging of health medical big data
CN112825275A (en) * 2019-11-21 2021-05-21 四川省人民医院 Method for predicting health state through physical examination indexes based on machine learning

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107145715A (en) * 2017-04-12 2017-09-08 温州医科大学 A kind of clinical medical intelligent discriminating gear based on election algorithm
CN107301331A (en) * 2017-07-20 2017-10-27 北京大学 A kind of method for digging of the sickness influence factor based on microarray data
CN107463993A (en) * 2017-08-04 2017-12-12 贺志尧 Medium-and Long-Term Runoff Forecasting method based on mutual information core principle component analysis Elman networks
CN107680676A (en) * 2017-09-26 2018-02-09 电子科技大学 A kind of gestational diabetes Forecasting Methodology based on electronic health record data-driven

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
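卢鹤鸣: "Computer-aided diagnosis system for common liver diseases based on multi-phase CT images", 《CNKI Outstanding Master's Theses Full-Text Database》, no. 08, 15 August 2014 (2014-08-15), pages 40 *
张涛 et al.: "Model selection based on feature screening", 《Journal of Guangxi University of Science and Technology》, vol. 27, no. 1, pages 163 - 165 *
王黎明 et al.: "Rolling statistical forecasting scheme for PM2.5 concentration based on the distance correlation coefficient and support vector machine regression", 《Acta Scientiae Circumstantiae》, vol. 37, no. 4, pages 1269 - 1276 *
舒天然: 《Research on Liquidity Assistance by China's Central Bank and Its Decision Support System》, 31 August 2018, Xi'an Jiaotong University Press, pages: 60 - 61 *
郁飞: 《Experimental Design and Data Processing》, 31 July 1999, Standards Press of China, pages: 55 - 56 *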

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110164559A (en) * 2019-04-28 2019-08-23 万达信息股份有限公司 A kind of lunger's early warning system based on electronic health record data
CN110674373A (en) * 2019-09-17 2020-01-10 上海森亿医疗科技有限公司 Big data processing method, device, equipment and storage medium based on sensitive data
CN110674373B (en) * 2019-09-17 2020-08-07 上海森亿医疗科技有限公司 Big data processing method, device, equipment and storage medium based on sensitive data
CN110631640A (en) * 2019-10-29 2019-12-31 云南师范大学 Intelligent monitoring system and method for storage tobacco leaf mildew state based on Internet of things
CN112825275A (en) * 2019-11-21 2021-05-21 四川省人民医院 Method for predicting health state through physical examination indexes based on machine learning
CN111243753A (en) * 2020-02-27 2020-06-05 西安交通大学 Medical data-oriented multi-factor correlation interactive analysis method
CN111243753B (en) * 2020-02-27 2024-04-02 西安交通大学 Multi-factor correlation interactive analysis method for medical data
CN111710411A (en) * 2020-05-29 2020-09-25 中润普达(十堰)大数据中心有限公司 Intelligent disease presumption system based on blood fat inspection indexes
CN111863240A (en) * 2020-07-08 2020-10-30 中润普达(十堰)大数据中心有限公司 Disease cognitive system based on abnormal change of human body fluid
CN112765144A (en) * 2021-01-22 2021-05-07 武汉大学 Method for checking and correcting conflict items after merging of health medical big data
CN112765144B (en) * 2021-01-22 2023-04-25 武汉大学 Method for checking and correcting conflict items after merging big health medical data

Similar Documents

Publication Publication Date Title
CN109671507A (en) A kind of obstetrics' disease that calls for specialized treatment coupling index method for digging based on Electronic Health Record
CN109119167B (en) Sepsis mortality prediction system based on integrated model
Ahmed et al. An investigative study on motifs extracted features on real time big-data signals
CN109846472A (en) Beat classification method based on BiLSTM-Attention deep neural network
CN110148466A (en) A kind of heart impact signal atrial fibrillation computer aided diagnosing method based on transfer learning
CN111816321B (en) System, apparatus and storage medium for intelligent infectious disease identification based on legal diagnostic criteria
CN108511055A (en) Ventricular premature beat identifying system and method based on Multiple Classifier Fusion and diagnostic rule
CN107145715B (en) Clinical medicine intelligence discriminating gear based on electing algorithm
CN111968748A (en) Modeling method of diabetic complication prediction model
Shu et al. Clinical application of machine learning-based artificial intelligence in the diagnosis, prediction, and classification of cardiovascular diseases
CN109805924A (en) ECG's data compression method and cardiac arrhythmia detection system based on CNN
CN112101413A (en) Intelligent system for predicting cerebral apoplexy risk
CN113611419A (en) Postpartum hemorrhage risk prediction method and early warning system based on fetal monitoring uterine contraction diagram and high-risk factors
CN113593708A (en) Sepsis prognosis prediction method based on integrated learning algorithm
Zhang et al. Application of deep neural network for congestive heart failure detection using ECG signals
CN116864062B (en) Health physical examination report data analysis management system based on Internet
CN109157211A (en) A kind of portable cardiac on-line intelligence monitoring diagnosis system design method
CN116631558B (en) Construction method of medical detection project based on Internet
CN113539473A (en) Method and system for diagnosing brucellosis only by using blood routine test data
CN112002413A (en) Cardiovascular system infection intelligent cognitive system, equipment and storage medium
CN116564521A (en) Chronic disease risk assessment model establishment method, medium and system
CN117116477A (en) Construction method and system of prostate cancer disease risk prediction model based on random forest and XGBoost
CN115083616B (en) Chronic nephropathy subtype mining system based on self-supervision graph clustering
CN110739072A (en) Bleeding event occurrence evaluation method and system
CN110827275A (en) Liver nuclear magnetic artery phase image quality grading method based on raspberry group and deep learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information

Inventor after: Li Jing

Inventor after: Li Guangya

Inventor after: Zhang Jingyi

Inventor after: Jiang Feng

Inventor after: Lu Ping

Inventor after: Lu Pengfei

Inventor after: Ding Xie

Inventor before: Li Jing

Inventor before: Li Guangya

Inventor before: Zhang Jingyi

Inventor before: Jiang Feng

Inventor before: Lu Ping

Inventor before: Lu Pengfei

Inventor before: Ding Xie
