KR101799823B1 - Method of Normalization for Combination of Clinical Data from Different Electronic Healthcare Databases and System thereof - Google Patents
- Publication number
- KR101799823B1 (application number KR1020150113953A)
- Authority
- KR
- South Korea
- Prior art keywords
- data set
- normalization
- data
- normalization target
- database
Classifications
- G06F19/3443
- G06F19/32
- G06F19/36
Abstract
The present invention relates to a method and apparatus for normalizing and analyzing medical data collected from a plurality of medical institutions.
A method for normalizing multicenter inspection data according to the present invention includes: a subgroup dividing step of dividing a normalization target data set, stored in a normalization target database to be normalized, into at least two subgroups based on a predetermined characteristic index; a correction statistical index calculating step of calculating a correction statistical index for normalizing the normalization target data set according to a reference data set, using statistical information of the normalization target data set and of the reference data set stored in a reference database serving as the reference for the normalization; and a normalizing step of normalizing the inspection data values of the data samples included in the normalization target data set based on the reference data set using the correction statistical index.
According to the method and apparatus of the present invention, the heterogeneity in clinical test values that arises between databases because the values were measured at different medical institutions can be eliminated, so that the data can be combined and used in a single integrated analysis.
Description
The present invention relates to a method and system for normalizing and analyzing medical data collected from a plurality of medical institutions.
Distributed research networks, which allow mutual access to the inspection data that each institution acquires and manages, normalize the inspection data acquired by a plurality of partner institutions and share the analysis results, and thus have the advantage that more inspection data can be obtained. For example, medical data that a plurality of different hospitals obtain from their patients can be shared among partner institutions through a distributed research network. However, the results of tests or experiments obtained at different institutions, for example at different medical institutions, are measured with different equipment on different patients or groups of subjects. They are therefore clinically or demographically heterogeneous, which makes it difficult to integrate them into one data set for analysis.
The existing normalization methods, however, do not give satisfactory results when inspection data acquired from multiple institutions are integrated into a single data set for analysis. For example, both the existing rank-based method (Prior Art Document 001) and the conventional Z-score transformation method (Prior Art Document 002) are limited when applied to such multicenter inspection data. The Z-score transformation in particular merely matches the mean and variance between different data groups and therefore cannot correct the heterogeneity between the data.
In other words, the existing test data normalization or integrated analysis methods have a limitation in correctly integrating and analyzing test record data such as clinical test data and medical record data acquired and managed by various institutions.
Prior Art Document 001: Beasley TM, Erickson S, Allison DB. Rank-based inverse normal transformations are increasingly used, but are they merited? Behav Genet 2009; 39: 580-595. DOI: 10.1007/s10519-009-9281-0
Prior Art Document 002: Cheadle C, Vawter MP, Freed WJ, et al. Analysis of microarray data using Z score transformation. J Mol Diagn 2003; 5: 73-81. DOI: 10.1016/S1525-1578(10)60455-2
The present invention provides a method, and a system therefor, for normalizing and analyzing inspection data included in databases that store medical data sets acquired from patients and stored and managed by a plurality of medical institutions.
The problem to be solved by the present invention is to provide a method, and a system therefor, that overcomes the clinicopathological or demographic structural heterogeneity present in inspection data obtained from different patients or subjects at different medical institutions, and that normalizes and integrates the inspection data sets.
It is another object of the present invention to provide a method and system for analyzing a plurality of sets of medical record data while maintaining the confidentiality of patient information included in the medical record.
According to one aspect of the present invention, there is provided a method for normalizing multicenter inspection data, the method comprising: a subgroup dividing step of dividing a normalization target data set, stored in a normalization target database to be normalized, into at least two subgroups based on a predetermined characteristic index; a correction statistical index calculating step of calculating a correction statistical index for normalizing the normalization target data set according to a reference data set, using statistical information of the normalization target data set and of the reference data set stored in a reference database serving as a reference for performing the normalization; and a normalizing step of normalizing the inspection data values of the data samples included in the normalization target data set based on the reference data set using the correction statistical index.
Here, the reference database and the normalization target database may be connected to each other through a network.
Here, the characteristic index may include a demographic index or a clinical index.
Here, the reference database and the normalization target database may be medical record databases that store medical records, and the inspection data values of the data samples included in the reference data set and the normalization target data set may be medical data values measured for a patient or a subject.
Here, the reference database and the normalization target database may be databases that operate in conjunction with electronic medical record systems in a distributed research network in which a plurality of electronic medical record systems are connected to each other through a network.
Here, the method may further include a database selection step of selecting the reference database or the normalization target database from among at least two databases connected through a network.
Here, the subgroup dividing step may include a step in which a second processing unit linked with the normalization target database divides the normalization target data set into a plurality of subgroups based on the characteristic index.
Here, the subgroup dividing step may further include a step in which a first processing unit linked with the reference database divides the reference data set into a plurality of subgroups based on the characteristic index.
Here, the correction statistical index calculating step may calculate the correction statistical index using information on the distribution of the population sizes of the subgroups of the normalization target data set.
Here, the correction statistical index may include a corrected mean and a corrected standard deviation of the reference data set, calculated by correcting the reference data set using the information on the distribution of the population sizes of the subgroups of the normalization target data set.
Here, the correction statistical index calculating step may calculate statistical information of the normalization target data set, including information on the distribution of the population sizes of its subgroups, calculate statistical information of the reference data set, including information on the distribution of the population sizes of its subgroups, and calculate the correction statistical index using both.
Here, the correction statistical index calculating step may include: a step in which the second processing unit linked with the normalization target database calculates the statistical information of the normalization target data set; a step in which the first processing unit linked with the reference database calculates the statistical information of the reference data set; and a step in which the first processing unit calculates the correction statistical index using the statistical information of the normalization target data set and the statistical information of the reference data set.
Here, the statistical information of the reference data set may include the mean of the inspection data values of the data samples of the reference data set, the mean of the inspection data values of the data samples of each subgroup of the reference data set, and the number of data samples of each subgroup of the reference data set, and the statistical information of the normalization target data set may include the number of data samples of the normalization target data set and the number of data samples of each subgroup of the normalization target data set.
Here, the normalizing step may normalize the inspection data value of each data sample included in the normalization target data set using the corrected mean and the corrected standard deviation of the reference data set and the mean and standard deviation of the inspection data values of the data samples of the normalization target data set.
Here, the computer program according to another aspect of the present invention may be a computer program stored in a medium for executing the method of normalizing the multicenter inspection data in combination with a database.
According to another aspect of the present invention, there is provided a multicenter inspection data normalization system including: a first processing unit linked with a reference database storing a reference data set serving as a reference for performing normalization; and a second processing unit linked with a normalization target database storing a normalization target data set to be normalized, wherein the first processing unit calculates a correction statistical index for normalizing the normalization target data set according to the reference data set, and the second processing unit divides the normalization target data set into at least two subgroups based on a predetermined characteristic index and normalizes the inspection data values of the data samples included in the normalization target data set based on the reference data set using the correction statistical index.
Here, the second processing unit may include: a second subgroup dividing unit that divides the normalization target data set into the subgroups based on the characteristic index; a second statistical information calculating unit that calculates statistical information of the normalization target data set, including information on the distribution of the population sizes of its subgroups; and a normalization unit that normalizes the inspection data values of the data samples included in the normalization target data set based on the reference data set using the correction statistical index.
The first processing unit may include: a first subgroup dividing unit that divides the reference data set into the subgroups based on the characteristic index; a first statistical information calculating unit that calculates statistical information of the reference data set, including information on the distribution of the population sizes of its subgroups; and a correction statistical index calculation unit that calculates the correction statistical index using the statistical information of the normalization target data set, the statistical information of the reference data set, and the inspection data values of the data samples of the reference data set.
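The statistical information that each processing unit prepares can be sketched in Python as follows. This is only an illustration under stated assumptions: the class and function names (`TargetStatistics`, `ReferenceStatistics`, `summarize_target`, `summarize_reference`) are hypothetical and do not come from the patent, and each sample is assumed to be a `(subgroup label, inspection data value)` pair. The point of the sketch is that the summary produced on the normalization target side contains only counts, not patient-level values.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from statistics import fmean

# Hypothetical sample layout: (subgroup label, inspection data value).
Sample = tuple[str, float]


@dataclass(frozen=True)
class TargetStatistics:
    """Summary computed on the normalization target side (counts only)."""
    total: int                       # number of data samples in the target data set
    subgroup_counts: dict[str, int]  # number of data samples per subgroup


@dataclass(frozen=True)
class ReferenceStatistics:
    """Summary computed on the reference side."""
    mean: float                        # mean over all reference samples
    subgroup_means: dict[str, float]   # mean per subgroup
    subgroup_counts: dict[str, int]    # number of data samples per subgroup


def summarize_target(samples: list[Sample]) -> TargetStatistics:
    counts = Counter(label for label, _ in samples)
    return TargetStatistics(total=len(samples), subgroup_counts=dict(counts))


def summarize_reference(samples: list[Sample]) -> ReferenceStatistics:
    by_group: dict[str, list[float]] = defaultdict(list)
    for label, value in samples:
        by_group[label].append(value)
    return ReferenceStatistics(
        mean=fmean(value for _, value in samples),
        subgroup_means={g: fmean(v) for g, v in by_group.items()},
        subgroup_counts={g: len(v) for g, v in by_group.items()},
    )
```

Keeping the two summaries as separate immutable records makes it easy to see which values would have to leave each site; how they are actually transmitted is outside the scope of this sketch.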
According to the method and apparatus of the present invention, the heterogeneity in clinical test values that arises between databases because the values were measured at different medical institutions can be eliminated, so that the data can be combined and used in a single integrated analysis.
In particular, the method of normalizing multicenter inspection data according to the present invention makes it possible to normalize, integrate, and analyze medical data acquired from different data sources in a distributed research network environment.
FIG. 1 is a reference view showing a network system in which a method of normalizing multicenter inspection data according to an embodiment of the present invention operates.
FIG. 2 is a block diagram illustrating a system in which a method for normalizing multicenter inspection data according to an embodiment of the present invention operates.
FIG. 3 is a flowchart of a method for normalizing multicenter inspection data according to an embodiment of the present invention.
FIG. 4 is a flowchart of a method for normalizing multicenter inspection data according to another embodiment of the present invention.
FIG. 5 is a detailed flowchart of a method for normalizing multicenter inspection data according to the present invention.
FIG. 6 is a detailed block diagram of a first processing unit operating in conjunction with a reference database in the multicenter inspection data normalization system according to the present invention.
FIG. 7 is a detailed block diagram of a second processing unit operating in conjunction with a normalization target database in the multicenter inspection data normalization system according to the present invention.
FIG. 8 is a reference diagram showing an example of the operation of the correction statistical index calculation step according to the present invention.
FIG. 9 is a reference diagram showing the heterogeneity of data between data sets obtained from different medical institutions.
FIG. 10 is a reference diagram comparing the results of the subgroup-adjusted normalization method according to the present invention with the results of the conventional methods.
Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings. In the drawings, the same reference numerals are used to designate the same or similar components throughout the drawings. In the following description of the present invention, a detailed description of known functions and configurations incorporated herein will be omitted when it may make the subject matter of the present invention rather unclear. In addition, the preferred embodiments of the present invention will be described below, but it is needless to say that the technical idea of the present invention is not limited thereto and can be variously modified by those skilled in the art.
First, the background against which the method and system for normalizing multicenter inspection data according to the present invention were devised will be described in more detail.
Electronic health record systems or electronic medical record systems, which acquire and manage medical records electronically, are now widely used, and research is under way to connect such systems over networks among partner institutions and to integrate and analyze their data. Distributed research networks, which allow mutual access to the medical data that each institution acquires and manages, have the advantage that more inspection data sets can be collected by jointly analyzing the inspection data acquired by a plurality of partner institutions. The partner institutions are preferably connected to each other over a network. For example, medical data that a plurality of different hospitals obtain from their patients can be shared among partner institutions through a distributed research network. In addition, analysis data integrated through a distributed research network can be used effectively for retrospective studies.
However, integrating and analyzing data from the electronic medical record systems of different medical institutions runs into the following problems or limitations.
First, the test or experimental results obtained at different medical institutions are measured with different equipment on different patients or groups of subjects. The inspection data obtained from each of these institutions are therefore clinically or demographically heterogeneous and difficult to integrate into one data set.
The existing normalization methods, however, do not give satisfactory results when inspection data acquired from multiple institutions are integrated into a single data set for analysis. For example, both the existing rank-based method (Prior Art Document 001) and the conventional Z-score transformation method (Prior Art Document 002) are limited when applied to such multicenter inspection data. The Z-score transformation in particular simply matches the mean and variance between different data groups and therefore cannot correct the heterogeneity between the data.
In other words, the existing inspection data normalization or integrated analysis methods cannot correctly integrate and analyze inspection record data, such as clinical test data or medical record data acquired and managed by various institutions, while preserving their clinical meaning.
Second, due to the nature of medical records, there is a problem of confidentiality of patient information. Therefore, there is a limitation that the medical records including the patient's personal information can not be simply analyzed and exchanged among the medical institutions.
The present invention provides a normalization method, and a system therefor, that resolves the heterogeneity between medical data acquired at various medical institutions in a distributed research network while at the same time ensuring the confidentiality of the medical records as described above. The method proposed by the present invention is called subgroup-adjusted normalization (SAN).
Hereinafter, the method of normalizing the multicenter inspection data according to the present invention and the operation of the system will be described in detail.
FIG. 1 is a reference view showing a network system in which a method of normalizing multicenter inspection data according to an embodiment of the present invention operates.
A plurality of institutions can use databases respectively to store and manage the inspection data to be acquired. In the method for normalizing the multicenter inspection data according to the present invention, a
Next, the method for normalizing the multicenter inspection data according to the present invention can normalize the values of data samples included in the data set stored in the remaining databases on the basis of the selected
FIG. 2 is a block diagram illustrating a system in which a method for normalizing multicenter inspection data according to an embodiment of the present invention operates.
As shown in FIG. 1, in order to normalize the
FIG. 3 is a flowchart of a method for normalizing multicenter inspection data according to an embodiment of the present invention.
The method for normalizing multicenter inspection data according to the present invention may include a subgroup dividing step (S100), a correction statistical index calculation step (S200), and a normalization step (S300). If necessary, the method may further include a database selection step (S50). FIG. 4 is a flowchart of the method according to an embodiment that further includes the database selection step (S50).
In the partial grouping step S100, the normalization target data set stored in the
The correction statistical index calculation step S200 may calculate the normalization target data set as the reference data set using the statistical information of the normalization target data set and the reference data set stored in the
In the normalization step S300, the inspection data values of the data samples included in the normalization target data set are normalized based on the reference data set using the correction statistical index.
The database selection step S50, which may be further included as described above, selects the
Here, the
Here, the data samples contained in a data set have predetermined inspection data values, and the inspection data values can be various types of data values obtained by performing a test on a particular subject. For example, the inspection data value may be a medical data value measured for a patient or a subject. Here, the medical data values may include various types of data acquired at a hospital, medical institution, or laboratory by testing a patient or subject for a medical purpose or for experimental purposes related to medical practice. The medical data may also include any clinical test data. For example, the medical data may be body weight, height, the size of a particular body part, or data about a particular component present in a body fluid or tissue, such as blood, obtained from the patient or subject, and may include any other inspection data obtainable from a patient or subject.
The
Hereinafter, the operation of the subgroup dividing step (S100) will be described first.
In the partial grouping step S100, the normalization target data set stored in the
Here, the characteristic index may be a demographic variable or a clinical variable. For example, the characteristic index may be a demographic index such as sex, race, place of residence, or place of birth, or a clinical index such as an index indicating the severity of a specific disease. Hereinafter, for convenience of explanation, the characteristic indexes are described using age and sex as examples. However, the characteristic index is not limited to these examples and may include any demographic index or clinical index.
For example, in the subgroup dividing step (S100), the normalization target data set can be divided into a plurality of subgroups based on age and sex. For instance, age can be divided into bands such as 0 to 10, 11 to 20, 21 to 30, 31 to 40, 41 to 50, 51 to 60, 61 to 70, and 71 to 80, and the samples can at the same time be divided by sex into male and female subgroups.
In addition, the subgroup dividing step (S100) may divide the reference data set into subgroups based on the characteristic index in the same manner.
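The age-and-sex division described above can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the function names (`age_band`, `subgroup_key`, `subgroup_counts`) and the dictionary layout of a sample are hypothetical, and the open-ended `81+` band is an assumption, since the text only lists bands up to 71 to 80.

```python
from collections import Counter


def age_band(age: int, width: int = 10, top: int = 80) -> str:
    """Map an age to a band label: '0-10', '11-20', ..., '71-80'.

    Ages above `top` fall into an open-ended band (an assumption of this sketch).
    """
    if age < 0:
        raise ValueError("age must be non-negative")
    if age <= width:
        return f"0-{width}"
    if age > top:
        return f"{top + 1}+"
    lower = ((age - 1) // width) * width + 1
    return f"{lower}-{lower + width - 1}"


def subgroup_key(sample: dict) -> tuple[str, str]:
    """Subgroup = (age band, sex); `sample` is a hypothetical record layout."""
    return age_band(sample["age"]), sample["sex"]


def subgroup_counts(samples: list[dict]) -> Counter:
    """Number of data samples in each subgroup of a data set."""
    return Counter(subgroup_key(s) for s in samples)


if __name__ == "__main__":
    demo = [
        {"age": 34, "sex": "F", "value": 13.1},
        {"age": 40, "sex": "F", "value": 12.4},
        {"age": 67, "sex": "M", "value": 14.8},
    ]
    print(subgroup_counts(demo))
```

Applying the same `subgroup_key` to both the normalization target data set and the reference data set keeps the two divisions aligned, which the later steps rely on.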
FIG. 5 is a detailed flowchart of a method for normalizing multicenter inspection data according to the present invention.
Here, the subgroup segmentation step S100 may include a step S120 of the
The partial grouping step S100 further includes a step S110 of the
Here, the normalization target data set and the reference data set are both divided into the same number of subgroups based on the same characteristic index. The subgroups are used in the correction statistical index calculation step (S200) described below, which calculates the correction statistical index using information on the distribution of the population sizes across subgroups.
Next, the operation of the correction statistical index calculation step S200 will be described.
The correction statistical index calculation step S200 calculates a correction statistical index for normalizing the normalization target data set according to the reference data set. Here, the correction statistical index may be calculated using the statistical information of the normalization target data set and the reference data set stored in the
To this end, the correction statistical index calculation step (S200) may calculate statistical information of the normalization target data set, including information on the distribution of the population sizes of the subgroups of the normalization target data set, and may calculate the correction statistical index using that information.
Here, the statistical information of the normalization target data set may include information on the distribution of the population sizes of the subgroups of the normalization target data set. More specifically, the statistical information of the normalization target data set includes the number of data samples of the normalization target data set ($N_b$) and the number of data samples per subgroup of the normalization target data set ($n_{bi}$, the number of data samples of the $i$-th subgroup of the normalization target data set). That is, the information on the distribution of the population sizes of the subgroups of the normalization target data set may be expressed by the number of data samples and the number of data samples per subgroup.
Also, the correction statistical index calculation step (S200) may calculate the statistical information of the reference data set, including information on the distribution of the population sizes of the subgroups of the reference data set. The correction statistical index calculation step (S200) may then calculate the correction statistical index using the statistical information of the normalization target data set and the statistical information of the reference data set calculated as described above.
Here, the statistical information of the reference data set calculated in the correction statistical index calculation step (S200) may include the mean of the inspection data values of the data samples of the reference data set ($\mu_a$), the mean of the inspection data values of the data samples per subgroup of the reference data set ($\mu_{ai}$, the mean of the inspection data values of the data samples of the $i$-th subgroup of the reference data set), and the number of data samples per subgroup of the reference data set ($n_{ai}$, the number of data samples of the $i$-th subgroup of the reference data set).
The correction statistical index may include a corrected mean and a corrected standard deviation of the reference data set. Here, the corrected mean and the corrected standard deviation of the reference data set are obtained by correcting the mean and standard deviation of the reference data set using the statistical information of the normalization target data set and the statistical information of the reference data set. To this end, the correction statistical index calculation step (S200) may calculate the corrected mean and the corrected standard deviation of the reference data set using the statistical information of the normalization target data set and the statistical information of the reference data set.
Here, the corrected mean and the corrected standard deviation can be calculated by the following Equations (1) and (2).

$$\mu'_a = \sum_{i=1}^{S} \frac{n_{bi}}{N_b}\,\mu_{ai} \qquad (1)$$

$$\sigma'_a = \sqrt{\sum_{i=1}^{S} \frac{n_{bi}}{N_b} \cdot \frac{1}{n_{ai}} \sum_{j=1}^{n_{ai}} \left(x_{aij} - \mu_a\right)^2} \qquad (2)$$

Here, $\mu'_a$ and $\sigma'_a$ are the corrected mean and the corrected standard deviation of the reference data set, respectively, $x_{aij}$ is the inspection data value of the $j$-th data sample of the $i$-th subgroup of the reference data set, $\mu_a$ is the mean of the inspection data values of the data samples of the reference data set, $\mu_{ai}$ is the mean of the inspection data values of the data samples of the $i$-th subgroup of the reference data set, $S$ is the number of subgroups, $N_b$ is the number of data samples of the normalization target data set, $n_{ai}$ is the number of data samples of the $i$-th subgroup of the reference data set, and $n_{bi}$ is the number of data samples of the $i$-th subgroup of the normalization target data set.
Hereinafter, the operation of the correction statistical index calculation step (S200) will be described with reference to FIG. 5, which shows a detailed flowchart of the method of normalizing multicenter inspection data according to the present invention.
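As a hedged illustration of the corrected mean and corrected standard deviation, the following Python sketch assumes that the corrected mean is the average of the reference subgroup means weighted by the normalization target data set's subgroup proportions, and that the corrected variance weights each reference subgroup's mean squared deviation from the overall reference mean in the same way. The function name `corrected_reference_stats` and the dictionary-based inputs are hypothetical, and the formula is one reading of the variable definitions rather than a confirmed transcription of Equations (1) and (2).

```python
import math


def corrected_reference_stats(
    ref_values_by_group: dict[str, list[float]],
    target_counts: dict[str, int],
) -> tuple[float, float]:
    """Return (corrected mean, corrected standard deviation) of the reference set.

    ref_values_by_group: reference inspection values keyed by subgroup.
    target_counts: number of target samples per subgroup (same subgroup keys).
    """
    n_target = sum(target_counts.values())
    if n_target == 0:
        raise ValueError("target data set is empty")

    all_ref = [x for values in ref_values_by_group.values() for x in values]
    overall_ref_mean = sum(all_ref) / len(all_ref)

    corrected_mean = 0.0
    corrected_var = 0.0
    for group, n_target_group in target_counts.items():
        if n_target_group == 0:
            continue
        values = ref_values_by_group.get(group)
        if not values:
            raise ValueError(f"no reference samples for subgroup {group!r}")
        weight = n_target_group / n_target  # target subgroup proportion
        group_mean = sum(values) / len(values)
        # Mean squared deviation of this subgroup from the overall reference mean.
        group_msd = sum((x - overall_ref_mean) ** 2 for x in values) / len(values)
        corrected_mean += weight * group_mean
        corrected_var += weight * group_msd
    return corrected_mean, math.sqrt(corrected_var)


if __name__ == "__main__":
    reference = {"young": [10.0, 12.0], "old": [14.0, 16.0]}
    target = {"young": 1, "old": 3}  # the target set skews older than the reference
    print(corrected_reference_stats(reference, target))
```

Under these assumptions, a target data set with the same subgroup proportions as the reference leaves the reference mean and (population) standard deviation unchanged, while a skewed target shifts both toward the over-represented subgroups.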
The correction statistical index calculation step S200 includes a step S220 of calculating the statistical information of the normalization target data set by the
Here, steps S210 and S220 may be performed first. In addition, the step S210 of the
The correction statistical index calculation step S200 may further include a step S225 of transmitting the statistical information of the normalization target data set calculated by the
The correction statistical index calculation step S200 may further include a step S235 of transmitting the correction statistical index calculated by the
In the case where the processing unit is divided into the
Next, the operation of the normalization step (S300) will be described.
In the normalization step S300, the inspection data values of the data samples included in the normalization target data set are normalized based on the reference data set using the correction statistical index. Here, the normalization operation as described above can be performed by the
In more detail, the normalization step (S300) normalizes the inspection data values of the data samples included in the normalization target data set using the corrected mean and the corrected standard deviation of the reference data set together with the mean and standard deviation of the inspection data values of the data samples of the normalization target data set.
Here, the normalization step (S300) may normalize the inspection data values of the data samples included in the normalization target data set according to the following Equation (3).

$$x'_b = \frac{x_b - \mu_b}{\sigma_b}\,\sigma'_a + \mu'_a \qquad (3)$$

Here, $x'_b$ is the normalized result, $x_b$ is an inspection data value of a data sample included in the normalization target data set, $\mu_b$ and $\sigma_b$ are the mean and standard deviation of the inspection data values of the data samples of the normalization target data set, and $\mu'_a$ and $\sigma'_a$ are the corrected mean and the corrected standard deviation of the reference data set, respectively.
FIG. 8 is a reference diagram showing an example of the operation of the correction statistical index calculation step (S200) according to the present invention. As shown in FIG. 8, the correction statistical index calculation step (S200) can calculate the corrected mean ($\mu'_a$) and the corrected standard deviation ($\sigma'_a$) of the reference data set (Dataset Da) in order to normalize the normalization target data set (Dataset Db). In FIG. 8, "mean" is the mean of the inspection data values of the data samples belonging to each subgroup (Subgroup (i)) ($\mu_{ai}$ for the reference data set and $\mu_{bi}$ for the normalization target data set), "N" is the number of data samples belonging to each subgroup, and the average of the squared differences from the total mean is, for each subgroup, the mean of the squared differences between its inspection data values and the mean of the whole data set. The variables have the same meanings as in Equations (1) to (3) above. In this case, the corrected mean ($\mu'_a$) and the corrected standard deviation ($\sigma'_a$) of the reference data set can be calculated according to Equations (1) and (2).
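Once the corrected mean and corrected standard deviation of the reference data set are available, the normalization step can be sketched in Python as a rescaled Z-score. The sketch assumes that each target value is standardized with the target data set's own mean and standard deviation and then mapped onto the corrected reference scale; the function name `san_normalize` is hypothetical, and the use of the population standard deviation (`pstdev`) rather than the sample standard deviation is an assumption, since the text does not say which is intended.

```python
from statistics import fmean, pstdev


def san_normalize(
    target_values: list[float],
    corrected_mean: float,
    corrected_sd: float,
) -> list[float]:
    """Map target values onto the corrected reference scale.

    Each value is standardized within the target data set and then rescaled
    by the corrected standard deviation and shifted by the corrected mean.
    """
    target_mean = fmean(target_values)
    target_sd = pstdev(target_values)  # population SD: an assumption of this sketch
    if target_sd == 0:
        raise ValueError("target values have zero variance")
    return [
        (x - target_mean) / target_sd * corrected_sd + corrected_mean
        for x in target_values
    ]


if __name__ == "__main__":
    hemoglobin_b = [11.0, 12.5, 14.0]  # illustrative values only
    print(san_normalize(hemoglobin_b, corrected_mean=14.0, corrected_sd=2.0))
```

By construction, the normalized values have the corrected mean and corrected standard deviation, while the rank order of samples within the target data set is preserved.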
FIG. 2 is a block diagram illustrating a multicenter inspection data normalization system according to an embodiment of the present invention.
2, in order to normalize the
The
Here, the
The
FIG. 6 is a detailed block diagram of the
As shown in FIG. 6, the
The first
The first statistical
The correction
As shown in FIG. 7, the
Here, the second
The second statistical
The
The multicenter inspection data normalization computer program according to another embodiment of the present invention may be a computer program stored in a medium for executing the method of normalizing multicenter inspection data in combination with a database.
Hereinafter, the performance and effects of the method and system for normalizing multicenter inspection data according to the present invention will be described based on actual experimental results.
The subgroup-adjusted normalization method according to the present invention was applied to blood test data, including blood urea nitrogen (BUN), serum creatinine, hematocrit, hemoglobin, serum potassium, and total bilirubin, obtained at two hospitals (A and B). As described below, it was confirmed that the heterogeneity between the test data was eliminated more effectively than with existing methods. The standardized difference in means (SDM) and the Kolmogorov-Smirnov value (KS) were used for the performance test.
FIG. 9 is a reference diagram showing the heterogeneity that exists between data sets obtained from different medical institutions. As can be seen from FIGS. 9(a) to 9(d), the inspection data obtained at the two hospitals A and B show different hemoglobin distribution characteristics by age (FIGS. 9(a) and 9(c)) and by gender (FIGS. 9(b) and 9(d)).
The blood test data obtained from the two hospitals were normalized with the subgroup-adjusted normalization method according to the present invention and with two conventional methods, the Z-score transformation and the rank-based inverse normal transformation (INT). The results of each method were compared in terms of the standardized difference in means (SDM) and the Kolmogorov-Smirnov value (KS), as shown in FIGS. 10(a) and 10(b).
The Z-score transformation method followed "Cheadle C, Vawter MP, Freed WJ, et al. Analysis of microarray data using Z score transformation. J Mol Diagn 2003; 5: 73-81. DOI: 10.1016/S1525-1578(10)60455-2", and the rank-based inverse normal transformation (INT) method followed "Bolstad BM, Irizarry RA, Astrand M, et al. A comparison of normalization methods for high density oligonucleotide array data based on variance and bias. Bioinformatics 2003; 19: 185-193".
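For orientation, the two conventional baselines named above can be sketched in Python as follows. This is a minimal illustration, not the exact procedure used in the experiment: the population standard deviation in `z_score`, the average-rank handling of ties, and the Blom offset `c = 3/8` in `rank_inverse_normal` are all assumptions, and both function names are hypothetical.

```python
from statistics import NormalDist, mean, pstdev


def z_score(values):
    """Center on the data set's own mean and scale by its own SD."""
    mu, sd = mean(values), pstdev(values)
    return [(x - mu) / sd for x in values]


def rank_inverse_normal(values, c=3.0 / 8.0):
    """Map ranks to standard normal quantiles (rank-based INT)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied values starting at sorted position i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank shared by ties
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    std_normal = NormalDist()
    # (r - c) / (n - 2c + 1) always lies strictly between 0 and 1.
    return [std_normal.inv_cdf((r - c) / (n - 2 * c + 1)) for r in ranks]


if __name__ == "__main__":
    print(z_score([1.0, 2.0, 3.0]))
    print(rank_inverse_normal([5.0, 1.0, 3.0]))
```

Both baselines use only each data set's own statistics. Neither takes the subgroup composition of the two sites into account, which is the gap the subgroup-based correction is meant to address.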
Here, the standardized difference in means (SDM) is calculated as shown in Equation (5), and the Kolmogorov-Smirnov value (KS) is calculated as shown in Equation (6).
In Equation (5), the symbols denote the mean and the variance of the m-th data set, respectively.
In Equation (6), sup is the supremum of the set of distances, and the two functions are the empirical distribution functions of the first data set and the second data set, respectively.
As can be seen in FIG. 10, the SAN method according to the present invention achieves better normalization than the conventional Z-score transformation and rank-based inverse normal transformation (INT) methods. Here, RAW denotes the raw data.
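The two evaluation metrics used in this comparison can be sketched in Python as follows. Equations (5) and (6) are not reproduced in this text, so the sketch assumes the commonly used forms: the absolute difference of the two means divided by the square root of the average of the two variances, and the largest absolute gap between the two empirical distribution functions. The use of the sample variance and both function names are assumptions.

```python
from bisect import bisect_right
from math import sqrt
from statistics import mean, variance


def standardized_difference_in_means(a, b):
    """Absolute mean difference scaled by the averaged variance (assumed form)."""
    pooled = (variance(a) + variance(b)) / 2  # sample variances (assumption)
    return abs(mean(a) - mean(b)) / sqrt(pooled)


def ks_statistic(a, b):
    """Largest absolute gap between the two empirical distribution functions."""
    sa, sb = sorted(a), sorted(b)
    d = 0.0
    # Both functions are step functions, so the supremum is reached at a data point.
    for x in sa + sb:
        fa = bisect_right(sa, x) / len(sa)
        fb = bisect_right(sb, x) / len(sb)
        d = max(d, abs(fa - fb))
    return d


if __name__ == "__main__":
    site_a = [1.0, 2.0, 3.0]
    site_b = [2.0, 3.0, 4.0]
    print(standardized_difference_in_means(site_a, site_b))
    print(ks_statistic(site_a, site_b))
```

For both metrics a value closer to zero indicates that the two data sets are more alike, so a normalization method performs better when it lowers them relative to the raw data.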
As described above, the method and apparatus for normalizing multicenter inspection data according to the present invention can eliminate the heterogeneity in clinical test values that arises between databases because the values were measured at different medical institutions, so that the databases can be used efficiently together in a single integrated analysis.
In particular, the method of normalizing multicenter inspection data according to the present invention makes it possible to normalize, integrate, and analyze medical data acquired from different data sources in a distributed research network environment.
Although all of the components of the embodiments described above are described as being combined or operating in combination, the present invention is not necessarily limited to these embodiments. That is, within the scope of the present invention, the components may be selectively combined in one or more combinations.
In addition, although each of the components may be implemented as independent hardware, some or all of the components may be selectively combined and implemented as a computer program that performs some or all of their functions on one or more pieces of hardware. Such a computer program may be stored on a computer-readable medium such as a USB memory, a CD, or a flash memory, and read and executed by a computer to implement an embodiment of the present invention. The recording medium of the computer program may include a magnetic recording medium, an optical recording medium, and the like.
Furthermore, all terms including technical or scientific terms have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs, unless otherwise defined in the Detailed Description. Commonly used terms, such as predefined terms, should be interpreted to be consistent with the contextual meanings of the related art, and are not to be construed as ideal or overly formal, unless expressly defined to the contrary.
It will be apparent to those skilled in the art that various modifications, changes, and substitutions are possible without departing from the scope and spirit of the invention as disclosed in the accompanying claims. Therefore, the embodiments disclosed in the present invention and the accompanying drawings are intended to illustrate, not to limit, the technical spirit of the present invention, and the scope of that technical idea is not limited by these embodiments or drawings. The scope of protection of the present invention should be construed according to the following claims, and all technical ideas within the scope of equivalents should be construed as falling within the scope of the present invention.
10: Reference database
20: Normalization target database
100: first processing section
200: second processing section
S50: Database selection step
S100: Subgrouping step
S200: Correction statistical index calculation step
S300: Normalization step
Claims (20)
A subgroup dividing step of dividing a normalization target data set, stored in a normalization target database to be normalized, into at least two subgroups based on a predetermined characteristic index;
A correction statistical index calculation step of calculating a correction statistical index for normalizing the normalization target data set according to a reference data set, using statistical information of the normalization target data set and of the reference data set stored in a reference database serving as the reference for normalization; And
A normalization step of normalizing the inspection data values of the data samples included in the normalization target data set based on the reference data set, using the correction statistical index.
Wherein the reference database and the normalization target database are connected to each other through a network.
Wherein the characteristic index includes a demographic index or a clinical index.
Wherein the reference database and the normalization target database are medical record databases for storing medical records,
Wherein the inspection data values of the data samples included in the reference data set and the normalization target data set are medical data values measured for a patient or subject.
Wherein the reference database and the normalization target database are each a database that operates in cooperation with one of a plurality of electronic medical record systems interconnected via a network.
Further comprising a database selection step of selecting the reference database or the normalization target database from among at least two databases connected through the network.
Wherein the subgroup dividing step comprises:
A step in which a second processing unit operating in conjunction with the normalization target database divides the normalization target data set into a plurality of subgroups based on the characteristic index.
Wherein the subgroup dividing step further comprises:
A step in which a first processing unit operating in conjunction with the reference database divides the reference data set into a plurality of subgroups based on the characteristic index.
The correction statistical index calculating step may include:
Wherein the correction statistical index is calculated using information on the distribution of sample counts across the subgroups of the normalization target data set.
Wherein the correction statistical index includes a corrected mean and a corrected standard deviation of the reference data set, calculated by correcting the reference data set using information on the distribution of sample counts across the subgroups of the normalization target data set.
The correction statistical index calculating step may include:
Calculating statistical information of the normalization target data set, including information on the distribution of sample counts across the subgroups of the normalization target data set, and
Calculating the correction statistical index using the statistical information of the normalization target data set calculated as described above and statistical information of the reference data set, which includes information on the distribution of sample counts across the subgroups of the reference data set.
The correction statistical index calculating step may include:
A step in which a second processing unit operating in conjunction with the normalization target database calculates the statistical information of the normalization target data set;
A step in which a first processing unit operating in conjunction with the reference database calculates the statistical information of the reference data set; And
A step in which the first processing unit calculates the correction statistical index using the statistical information of the normalization target data set and the statistical information of the reference data set.
Wherein the statistical information of the reference data set includes the mean of the inspection data values of the data samples of the reference data set, the mean of the inspection data values of the data samples in each subgroup of the reference data set, and the number of data samples in each subgroup of the reference data set,
Wherein the statistical information of the normalization target data set includes the number of data samples of the normalization target data set and the number of data samples in each subgroup of the normalization target data set.
Wherein the corrected average and the corrected standard deviation are calculated by the following Equations (1) and (2).
[Formula 1]
[Formula 2]
Here, the symbols denote, in order: the corrected mean and the corrected standard deviation of the reference data set; the inspection data value of a data sample in the i-th subgroup of the reference data set; the mean of the inspection data values of the data samples of the reference data set; the mean of the inspection data values of the data samples in the i-th subgroup of the reference data set; the number of subgroups, S; the number of data samples of the normalization target data set; the number of data samples in the i-th subgroup of the reference data set; and the number of data samples in the i-th subgroup of the normalization target data set.
Wherein the normalizing step normalizes the inspection data values of the data samples included in the normalization target data set using the corrected mean and corrected standard deviation of the reference data set together with the mean and standard deviation of the inspection data values of the data samples of the normalization target data set.
Wherein the normalizing step normalizes the inspection data values of the data samples included in the normalization target data set according to Equation (3).
[Formula 3]
Here, the symbols denote, in order: the normalized result; the inspection data value of a data sample included in the normalization target data set; the mean and standard deviation of the inspection data values of the data samples of the normalization target data set; and the corrected mean and corrected standard deviation of the reference data set.
A first processing unit operable to interoperate with a reference database storing a reference data set as a reference for performing normalization; And
And a second processing unit operable to interoperate with a normalization target database storing a normalization target data set to be normalized,
Wherein the first processing unit calculates a correction statistical index for normalizing the normalization target data set according to the reference data set,
Wherein the second processing unit divides the normalization target data set into at least two subgroups based on a predetermined characteristic index and normalizes the inspection data values of the data samples included in the normalization target data set based on the reference data set, using the correction statistical index.
A second subgroup division unit for dividing the normalization target data set into the subgroup based on the characteristic index;
A second statistical information calculating unit for calculating statistical information of the normalization target data set including information on the distribution of the number of the population of the normalization target data set; And
And a normalization unit for normalizing inspection data values of data samples included in the normalization target data set based on the reference data set using the correction statistical index.
A first subgroup dividing unit for dividing the reference data set into the subgroup based on the characteristic index;
A first statistical information calculating unit for calculating statistical information of the reference data set including information on the distribution of the number of the population of the reference data set; And
And a correction statistical index calculating unit for calculating the correction statistical index using the statistical information of the normalization target data set, the statistical information of the reference data set, and the inspection data values of the data samples of the reference data set.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020150113953A KR101799823B1 (en) | 2015-08-12 | 2015-08-12 | Method of Normalization for Combination of Clinical Data from Different Electronic Healthcare Databases and System thereof |
Publications (2)
Publication Number | Publication Date |
---|---|
KR20170019739A KR20170019739A (en) | 2017-02-22 |
KR101799823B1 true KR101799823B1 (en) | 2017-11-21 |
Family
ID=58315160
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020150113953A KR101799823B1 (en) | 2015-08-12 | 2015-08-12 | Method of Normalization for Combination of Clinical Data from Different Electronic Healthcare Databases and System thereof |
Country Status (1)
Country | Link |
---|---|
KR (1) | KR101799823B1 (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20200036296A (en) | 2018-09-28 | 2020-04-07 | 주식회사 어큐진 | Common data convert system for genome information |
KR20200036298A (en) | 2018-09-28 | 2020-04-07 | 주식회사 어큐진 | Common data convert method of genome information |
KR20220102166A (en) | 2021-01-11 | 2022-07-20 | 연세대학교 산학협력단 | A method for estimating a centralized model based on horizontal division without physical data sharing based on weighted integration |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR102226899B1 (en) * | 2018-11-16 | 2021-03-11 | 주식회사 딥바이오 | Method and system for consensus diagnosis system based on supervised learning |
KR102571593B1 (en) * | 2021-04-07 | 2023-08-28 | 주식회사 에비드넷 | A method of constructing an interest pattern candidate database using medical data between medical institutions, and its devicee |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2004518187A (en) | 2000-10-11 | 2004-06-17 | ヘルストリオ,インコーポレイテッド | Health management data communication system |
-
2015
- 2015-08-12 KR KR1020150113953A patent/KR101799823B1/en not_active Application Discontinuation
Also Published As
Publication number | Publication date |
---|---|
KR20170019739A (en) | 2017-02-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
KR101799823B1 (en) | Method of Normalization for Combination of Clinical Data from Different Electronic Healthcare Databases and System thereof | |
US11037070B2 (en) | Diagnostic test planning using machine learning techniques | |
JP5785184B2 (en) | Diagnostic techniques for continuous storage and integrated analysis of both medical and non-image medical data | |
EP2959414B1 (en) | Methods for indirect determination of reference intervals | |
EP1399868A2 (en) | Information processing method for disease stratification and assessment of disease progressing | |
Reynolds et al. | Association of time-varying blood pressure with chronic kidney disease progression in children | |
CN101061483A (en) | In-situ data collection architecture for computer-aided diagnosis | |
CN111048210A (en) | Method and device for evaluating disease risk based on fundus image | |
CN114023441A (en) | Severe AKI early risk assessment model and device based on interpretable machine learning model and development method thereof | |
Jiang et al. | Longitudinal analysis of change in mammographic density in each breast and its association with breast cancer risk | |
Haakma et al. | Belief elicitation to populate health economic models of medical diagnostic devices in development | |
Man et al. | Improving non-invasive hemoglobin measurement accuracy using nonparametric models | |
JP5799377B2 (en) | Abnormal frequency estimation device, abnormal frequency estimation program, abnormal frequency estimation method, and abnormal frequency estimation system | |
CN112967803A (en) | Early mortality prediction method and system for emergency patients based on integrated model | |
CN115691735B (en) | Multi-mode data management method and system based on slow-resistance pulmonary specialty data | |
JP7124265B2 (en) | Biomarker detection method, disease determination method, biomarker detection device, and biomarker detection program | |
JP5802614B2 (en) | CLINICAL INFORMATION DISPLAY DEVICE, CLINICAL INFORMATION DISPLAY DEVICE OPERATION METHOD, AND CLINICAL INFORMATION DISPLAY PROGRAM | |
EP4099335A1 (en) | System and method for estimation of delivery date of pregnant subject using microbiome data | |
Eadie et al. | Recommendations for research design and reporting in computer-assisted diagnosis to facilitate meta-analysis | |
Bahar et al. | Model Structure of Fetal Health Status Prediction | |
Kalra et al. | Online variational learning of finite inverted Beta‐Liouville mixture model for biomedical analysis | |
US20040030672A1 (en) | Dynamic health metric reporting method and system | |
Kim | Development and Validation of Traumatic Brain Injury Outcome Prognosis Model and Identification of Novel Quantitative Data-Driven Endotypes | |
CN110603592B (en) | Biomarker detection method, disease judgment method, biomarker detection device, and biomarker detection program | |
JP2014002498A (en) | Clinical information display device, operating method for the same, and clinical information display program |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
A201 | Request for examination | ||
E902 | Notification of reason for refusal |