CN112768058B - Method and device for processing medical data of metering information type - Google Patents

Method and device for processing medical data of metering information type

Info

Publication number
CN112768058B
Authority
CN
China
Prior art keywords
data
medical
metering information
metering
index
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110088239.9A
Other languages
Chinese (zh)
Other versions
CN112768058A (en)
Inventor
李红良
李浩淼
汪文鑫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan University WHU
Original Assignee
Wuhan University WHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan University WHU
Priority to CN202110088239.9A
Publication of CN112768058A
Application granted
Publication of CN112768058B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H: HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H: HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Public Health (AREA)
  • Biomedical Technology (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Pathology (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The invention discloses a method and a device for processing medical data of a metering information type. The method comprises the following steps: acquiring metering information in a medical database, wherein the metering information is the pure numerical data under each medical index in the medical database; cleaning and integrating abnormal data in the metering information; performing multi-unit characteristic processing on the medical indexes in the metering information according to the legal metering ranges corresponding to the detection instruments of the medical institution; independently extracting the original metering information marked with the same unit characteristic and the same legal metering range under the same medical index to form an independent data set; performing grading information processing on the metering information of the independent data set; merging the metering information and correcting merge conflicts; and performing statistical analysis on the corrected metering information to obtain the managed metering information. The invention can improve the governance capability and the utilization rate of medical data.

Description

Method and device for processing medical data of metering information type
Technical Field
The invention relates to the technical field of medical information, in particular to a method and a device for processing medical data of a metering information type.
Background
With the continuous development of information technology, the degree of informatization of hospitals is gradually increased, the range and the scale of medical data are also larger and larger, and how to effectively extract, store and utilize the medical data becomes an increasingly important problem.
At present, the medical systems used by the departments of a hospital run independently, each managing the medical data of its own patients. The architecture, data format and coding standard of these systems may also differ, so the medical data held by the various systems in a hospital cannot be integrated or structured, which greatly reduces its utilization rate. Although some clinical data centers currently have certain data management capabilities, professional processing of medical data is still lacking. Against this background, there is an urgent need to improve the capability of classifying, handling and processing medical data, so that effective medical data can be acquired accurately and its utilization rate improved.
Disclosure of Invention
The technical problem to be solved by the present invention is to provide a method and an apparatus for processing medical data of a metering information type, aiming at the defects in the prior art.
The technical scheme adopted by the invention for solving the technical problem is as follows:
the invention provides a method for processing medical data of a metering information type, which comprises the following steps:
acquiring metering information in a medical database, wherein the metering information is pure numerical data under each medical index in the medical database;
cleaning and integrating abnormal data in the metering information;
according to the legal metering range corresponding to the medical institution detection instrument, performing multi-unit characteristic processing on the medical index in the metering information;
independently extracting the original metering information marked with the same unit characteristic and the same legal metering range under the same medical index to form an independent data set;
carrying out grading information processing on the metering information of the independent data set;
merging the metering information and correcting the conflict;
and carrying out statistical analysis on the corrected metering information to obtain the managed metering information.
Further, the method for acquiring the metering information in the medical database comprises the following specific steps:
adding a characteristic mark to the metering information collected from the cooperative medical institution, and extracting the metering information from the medical database according to the characteristic mark to form an independent metering information database.
Further, the specific method for adding the feature mark to the metering information in the method of the present invention is as follows: in the process of column name standardization, a characteristic mark is added to each column of metering information for marking the data type of the column.
Further, the abnormal data in the method of the present invention specifically includes:
the abnormal data is non-pure numerical type data in the medical database, and comprises text information of pure text, grade information of pure grade, illegal information without specific meaning, and mixed information of any one or more types of information and numerical type information.
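The distinction between pure numerical data and the abnormal types listed above can be sketched as a small classifier. This is only an illustration: the regular expression and the token sets `GRADE_TOKENS` and `ILLEGAL_TOKENS` are hypothetical, since the patent does not list the actual data definition rules.

```python
import re

# Hypothetical rules; the patent does not publish its data definition rules.
NUMERIC = re.compile(r"^[+-]?\d+(\.\d+)?$")
GRADE_TOKENS = {"+", "-", "++", "+++", "positive", "negative"}
ILLEGAL_TOKENS = {"undetected", "not done", ""}


def classify(value: str) -> str:
    """Return 'metering', 'illegal', 'grade', 'mixed' or 'text' for one raw cell."""
    v = value.strip().lower()
    if NUMERIC.match(v):
        return "metering"      # pure numerical data
    if v in ILLEGAL_TOKENS:
        return "illegal"       # no specific meaning
    if v in GRADE_TOKENS:
        return "grade"         # pure grade information
    if re.search(r"\d", v):
        return "mixed"         # digits combined with other content, e.g. "165 cm"
    return "text"              # pure text


raw = ["0.6", "+", "undetected", "165 cm", "clear"]
print([classify(x) for x in raw])
```

Only the cells classified as `metering` would stay in the metering information; the others would be handed to whatever flow handles their own type.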
Further, the method of the present invention performs multi-unit feature processing on the index in the measurement information, and the specific method is as follows:
according to the fact that the same medical index has different legal metering ranges, format standardization of multiple legal metering ranges is conducted on the same medical index;
adding corresponding hierarchical characteristics to multiple legal metering ranges according to different medical meanings represented by metering information under the medical index distributed in the legal metering ranges;
the metering information marked as the same legal metering range forms the same unit characteristic, and the unit characteristic is added to the corresponding medical index, so that the same medical index has a plurality of unit characteristics; the specific algorithm comprises the following steps:
1) Standardizing the physical examination index names from different sources according to an international term set, establishing a user-defined standard term set on that basis, and constructing a standard distribution database of the metering information indexes from metering information data cleaned at an earlier stage;
2) Processing the data to be cleaned by an algorithm to obtain a list of the data in non-pure-numerical form, correcting illegal numerical forms by regular-expression matching to obtain pure metering information data, and at the same time applying a logical check against the legal range given by the user-defined standard term set, removing the values of a metering index that are smaller than the lower limit or larger than the upper limit of the reference value, to obtain the metering information data within the legal range;
3) Comparing the medical metering data to be confirmed with the medical standard term distribution database for similarity: extracting the whole column of data under the same index from the same institution according to the corresponding medical reference range in the reference range table provided by the original institution, comparing the extracted data with the corresponding standard distribution data in the standard term library, and computing the statistical parameters of both; let the correlation coefficient be $r$, the medians of the extracted data and of the standard term library data be $m_1$ and $m_2$ respectively, their first quartiles be $a_1$ and $a_2$, and their third quartiles be $b_1$ and $b_2$; the weight value $w$ is calculated from these parameters as $w = 10r - 3(m_1 - m_2) - 3\,\frac{a_1 - a_2}{a_1} - 3\,\frac{b_1 - b_2}{b_1}$; then displaying and comparing a frequency histogram of the data, a box plot of the data distribution and a density plot of the data, and having the algorithm recommend the index name in the most similar standard term library according to the weight values; if the index to be cleaned does not exist in the existing standard distribution database, only displaying the distribution of the index, calculating the relevant parameters, and generating its box plot and density plot to form the correlation statistics and distribution plots;
4) After obtaining the correlation statistics and distribution plots generated for the index values under the same unit of each index, recommending the final metering index name and the corresponding correct reference range according to the weight value and the distribution shape, thereby standardizing the metering index to be cleaned by algorithm; then converting the data of the index into a coarse-grained grade form according to the medical reference range, using the conversion rule: 1 represents low, 2 represents normal, and 3 represents high, for use in subsequent data cleaning;
5) Merging the cleaned metering information data by identical term column, and performing distribution display and systematic error quality inspection.
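Step 2) above can be sketched as follows. This is a minimal illustration, not the patented algorithm: the regular expression, the helper names `to_number` and `clean_column`, and the legal range of 50 to 250 for a height index are all assumptions made for the example.

```python
import re

# Matches the first decimal number in a string such as "165 cm" (assumed pattern).
NUMBER = re.compile(r"[+-]?\d+(?:\.\d+)?")


def to_number(raw):
    """Extract the numeric part of a raw cell, or None if there is none."""
    match = NUMBER.search(raw)
    return float(match.group()) if match else None


def clean_column(values, lower, upper):
    """Keep only values that parse as numbers and lie inside [lower, upper]."""
    cleaned = []
    for raw in values:
        number = to_number(raw)
        if number is not None and lower <= number <= upper:
            cleaned.append(number)
    return cleaned


heights = ["165 cm", "180cm", "172", "not done", "1650 cm"]
print(clean_column(heights, lower=50, upper=250))  # [165.0, 180.0, 172.0]
```

Taking the first number in the string is a simplification; a cell such as "1-2" would need a more specific rule than this sketch provides.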
Further, the multi-unit feature processing of the medical indexes in the metering information in the method of the present invention also includes a method for generating a class A mapping table:
according to the normal value range corresponding to the index generated by a certain medical index provided by a medical institution under different detection methods and different detection batches, a rule table, namely an A-type mapping table, corresponding to the numerical value of each metering information and used for judging the medical significance of the metering information is formed so as to mark the medical significance behind the numerical value of each metering information; and according to the A-type mapping table, independently extracting the numerical value marked in the same normal value range under the medical index.
Further, the method of the present invention performs a grading information processing on the metering information, and the specific method thereof is as follows:
and converting the original metering information into corresponding graded information according to the graded characteristics, then combining the graded information generated by each independent data set, and finally converting all the metering information of the same unit characteristic under the same medical index into the graded information.
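As a rough sketch of this conversion, each independent data set can carry its own normal range and be graded against it before the grades are combined. The 120 to 175 range is the hemoglobin example given in the detailed description; the second data set in g/dL, the patient identifiers and the values are hypothetical.

```python
def to_grade(value, lower, upper):
    """Map a numeric value to 1 (low), 2 (normal) or 3 (high)."""
    if value < lower:
        return 1
    if value <= upper:
        return 2
    return 3


# Each independent data set: (lower limit, upper limit, {patient_id: value}).
independent_sets = {
    "g/L": (120, 175, {"p1": 95, "p2": 150, "p3": 180}),
    "g/dL": (12.0, 17.5, {"p4": 13.1, "p5": 18.2}),  # hypothetical second unit
}

# Grade every data set against its own range, then combine the grades.
graded = {}
for unit, (lower, upper, values) in independent_sets.items():
    for patient_id, value in values.items():
        graded[patient_id] = to_grade(value, lower, upper)

print(graded)  # {'p1': 1, 'p2': 2, 'p3': 3, 'p4': 2, 'p5': 3}
```

Once graded, values recorded in different units become directly comparable, which is the point of combining the data sets only after conversion.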
Further, the method of the present invention for performing merging conflict correction on metering information specifically comprises:
merging the original metering information in the independent data sets that share the same medical index and the same unit characteristic, marking as a merging conflict any case in which the same patient has two or more metering values under the same medical index and the same unit characteristic, and finally selecting the unique, correct metering information from each merging conflict.
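Conflict marking during the merge can be sketched as a grouping step. The record layout `(patient_id, index, unit_feature, value)` and the function name `merge_with_conflicts` are assumptions made for this illustration; the patent does not describe how the unique, correct value is finally chosen, so the sketch only separates conflicts for review.

```python
from collections import defaultdict


def merge_with_conflicts(datasets):
    """Merge data sets; keys with two or more records become merge conflicts."""
    grouped = defaultdict(list)
    for dataset in datasets:
        for patient_id, index, unit_feature, value in dataset:
            grouped[(patient_id, index, unit_feature)].append(value)

    merged, conflicts = {}, {}
    for key, values in grouped.items():
        if len(values) == 1:
            merged[key] = values[0]
        else:
            conflicts[key] = values  # left for manual or rule-based resolution
    return merged, conflicts


set_a = [("p1", "hemoglobin", "g/L", 130.0), ("p2", "hemoglobin", "g/L", 118.0)]
set_b = [("p2", "hemoglobin", "g/L", 142.0), ("p3", "hemoglobin", "g/L", 150.0)]

merged, conflicts = merge_with_conflicts([set_a, set_b])
print(conflicts)  # {('p2', 'hemoglobin', 'g/L'): [118.0, 142.0]}
```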
Further, the statistical analysis of the corrected metering information in the method of the present invention specifically includes:
performing a systematic error check on the corrected metering information against the metering information of the same index and the same unit characteristic that was collected from other cooperating medical institutions and has already been cleaned, marking unqualified metering information according to the statistical definition of consistency, further correcting the marked metering information, and obtaining the managed metering information once all metering information is confirmed to be qualified.
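The detailed description says that this check compares each institution's mean, median and standard deviation with those of the total sample and votes using difference ratio thresholds of 10%, 15% and 20%. The sketch below is one possible reading of that description. How the votes are cast and counted is not specified in the patent, so the rule used here (a threshold casts a vote when at least two of the three statistics exceed it, and an institution is flagged at two or more votes) is purely an assumption, as are the function names and the sample data.

```python
from statistics import mean, median, stdev

THRESHOLDS = (0.10, 0.15, 0.20)  # difference ratio thresholds from the description


def summary(values):
    return mean(values), median(values), stdev(values)


def diff_ratios(inst_values, all_values):
    """Relative difference of each statistic against the total sample."""
    return [abs(i - t) / abs(t)
            for i, t in zip(summary(inst_values), summary(all_values))]


def votes(inst_values, all_values):
    """Assumed rule: a threshold votes when at least two statistics exceed it."""
    ratios = diff_ratios(inst_values, all_values)
    return sum(sum(r > t for r in ratios) >= 2 for t in THRESHOLDS)


def is_flagged(inst_values, all_values, min_votes=2):
    return votes(inst_values, all_values) >= min_votes


# Hypothetical values of one index at four institutions.
institutions = {
    "A": [100, 110, 120, 130, 140],
    "B": [102, 108, 120, 132, 138],
    "C": [98, 112, 120, 128, 142],
    "D": [150, 160, 170, 180, 190],  # shifted upward
}
total = [v for values in institutions.values() for v in values]

for name, values in institutions.items():
    print(name, votes(values, total), is_flagged(values, total))
```

The sketch divides by the total-sample statistics, so it assumes none of them is zero and that each institution has at least two values.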
The invention provides a processing device for medical data of a metering information type, which comprises the following modules:
the reading module is used for reading metering information with characteristic marks from the collected medical data of the mixed data type;
the abnormal data cleaning module is used for extracting abnormal data from the metering information with the characteristic mark and clearing the abnormal data;
the multi-unit characteristic processing module is used for independently extracting the original metering information marked in the same legal metering range to enable the extracted original metering information to form an independent data set;
the conversion module is used for converting the original metering information in the independent data set into corresponding hierarchical information;
the merging module is used for merging the hierarchical information generated by the independent data sets and merging and marking merging conflicts on the metering information generated by the independent data sets;
and the statistical analysis module is used for performing a systematic error check on the corrected metering information and marking the unqualified metering information.
The invention has the following beneficial effects: the method and device for processing medical data of the metering information type can enhance the governance capability for clinical data, integrate the medical data of patients across the departments of a hospital, and realize the structuring of the medical data; the invention improves the capability of classifying, handling and processing medical data, allows effective medical data to be acquired accurately, and greatly improves the utilization rate of medical data.
Drawings
The invention will be further described with reference to the accompanying drawings and examples, in which:
FIG. 1 illustrates the data status, corresponding data type and medical significance of a medical index in a raw medical database according to an embodiment of the present invention;
FIG. 2 is a schematic illustration of the cleaning of metering type information in accordance with an embodiment of the present invention;
FIG. 3 is a general process flow diagram of metering type information according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
As shown in fig. 1 and 2, a method for processing medical data of a metering information type according to an embodiment of the present invention includes the steps of:
marking the pure numerical data under each medical index in the original medical database as metering information type data, and assigning the remaining data to other types according to the data definition rules, such as text information (pure text), grade information (pure grades) and meaningless illegal information such as "undetected" and "not done";
removing from the metering information, according to this classification principle, the other information types mixed into it, such as grade information, text information and illegal information, and routing them into the normalized data processing flows corresponding to their own types, while the retained pure metering information forms a second metering information data set, so as to improve the cleaning accuracy of metering information type data and increase the utilization rate of the other information types;
forming, according to the normal value ranges provided by a cooperating medical institution for a given medical index under different detection methods and different detection batches, a rule table, namely the A-type mapping table, that corresponds to each numerical value and is used to judge its medical significance, so that the medical significance behind each numerical value is marked; this improves the readability of the metering information, facilitates subsequent normalization according to medical significance, avoids the data silos caused by differing units and normal value ranges, and further improves the utilization rate of metering information type data.
According to the A-type mapping table, the numerical values marked in the same normal value range under the medical index are independently extracted to form a third metering information data set;
converting each numerical value in the third metering information data set into grade information with the same medical significance according to the corresponding medical significance of each numerical value in the third metering information data set to form a fourth independent data set, so that the fourth independent data set can be conveniently merged later; as shown in fig. 3, the specific algorithm involved therein is:
1. Referring to the international term set MedDRA, the physical examination index names from different sources are standardized against MedDRA, and a user-defined standard term set is established on that basis (the standard term library mainly comprises the standard terms, term classifications, term grading standards, legal values of the terms, upper and lower limits of the reference values, and term reference data; it defines 50 categories and 66 detailed classifications, for a total of 2226 metering standard terms). A standard distribution database of metering data indexes is constructed from metering data cleaned at an earlier stage (the standard distribution database is supplemented continuously and mainly comprises cleaned data for 1107 indexes; as an example of the storage format of each index, the entry for hemoglobin_measurement comprises the index name, hemoglobin_measurement, gender_classification and age_measurement, and represents the hemoglobin distribution of the population, against which the distribution of the corresponding index in the data to be cleaned is later compared);
2. The medical index data to be cleaned is processed by an algorithm to obtain a list of the data in non-pure-numerical form (e.g. the content list of the metering index "height_measurement" contains non-pure-numerical forms such as "165 cm" and "180 cm"). The algorithm developed in the invention then applies regular-expression matching (a way of describing string patterns; content judged to be non-numerical is removed) to correct the illegal numerical forms and obtain pure metering data (e.g. the content list of "height_measurement" is corrected from the original "165 cm", "180 cm" and the like to the pure numerical forms "165", "180" and the like). At the same time, a logical check is applied against the legal range given by the standard term table: content of a metering index that is smaller than the lower limit or larger than the upper limit of the reference value is removed, yielding metering data within the legal range;
3. The medical index data to be confirmed is compared with the medical standard term distribution database for similarity: the whole column of data under the same index from the same institution is extracted according to the corresponding medical reference range in the reference range table provided by the original institution, the extracted data is compared with the corresponding standard distribution data in the standard term library, and the statistical parameters of both are computed. Let the correlation coefficient be $r$, the medians of the extracted data and of the standard term library data be $m_1$ and $m_2$ respectively, their first quartiles be $a_1$ and $a_2$, and their third quartiles be $b_1$ and $b_2$. The weight value $w$ is calculated from these parameters as: $w = 10r - 3(m_1 - m_2) - 3\,\frac{a_1 - a_2}{a_1} - 3\,\frac{b_1 - b_2}{b_1}$.
Then a frequency histogram of the data, a box plot of the data distribution and a density plot of the data are displayed and compared, and the algorithm recommends the index name in the most similar standard term library according to the weight values; if the index to be cleaned does not exist in the existing standard distribution database, only the distribution of the index is displayed, the relevant parameters are calculated, and its box plot, density plot and the like are generated to form the correlation statistics and distribution plots;
4. After the correlation statistics and distribution plots generated for the index values under the same unit of each index are obtained, the final metering index name and the corresponding correct reference range are recommended according to the weight value and the distribution shape, so that the metering index to be cleaned is standardized by algorithm. The data are then converted into a coarse-grained grade form according to the medical reference range. For example, for hemoglobin_metering the correct medical reference range is 120-175, and the index is converted into hemoglobin_grade with the conversion rule 1: [0, 120) || 2: [120, 175] || 3: (175, ∞), where || means "or"; that is, each value of the index is converted into 1 (low), 2 (normal) or 3 (high) for use in subsequent data cleaning;
5. The cleaned metering data are merged by identical term column, and distribution display and systematic error quality inspection are performed. Systematic error correction of metering data is mainly used for error analysis of numerical variables after data cleaning: abnormal indexes are found by comparing the basic statistics of each numerical variable within each institution with those of the total sample. The specific implementation logic is: split the data of each institution by index; calculate the mean, median and standard deviation of each index over the total sample and over each institution; calculate the difference ratio between each institution and the total sample; obtain different voting results by setting difference ratio thresholds of 10%, 15% and 20%; and then identify the divergent indexes according to the voting results.
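The weight value from step 3 can be computed as in the sketch below. It transcribes the formula as the text gives it, with signed differences; whether absolute differences were intended is not stated. The patent also does not say how the correlation coefficient r is obtained for two samples of different size, so r is passed in as an input here. The function names, the sample values and the use of the "inclusive" quartile method are assumptions.

```python
from statistics import quantiles


def profile(values):
    """Return (median, first quartile, third quartile) of a sample."""
    q1, med, q3 = quantiles(values, n=4, method="inclusive")
    return med, q1, q3


def weight(r, extracted, reference):
    """w = r*10 - (m1 - m2)*3 - (a1 - a2)/a1*3 - (b1 - b2)/b1*3, as written."""
    m1, a1, b1 = profile(extracted)   # data to be cleaned
    m2, a2, b2 = profile(reference)   # standard distribution data
    return r * 10 - (m1 - m2) * 3 - (a1 - a2) / a1 * 3 - (b1 - b2) / b1 * 3


extracted = [110, 120, 130, 140, 150]   # hypothetical institution column
reference = [112, 122, 132, 142, 152]   # hypothetical standard distribution
print(round(weight(0.98, extracted, reference), 4))  # 15.8929
```

Because the differences are signed, a reference distribution with larger medians and quartiles raises w in this transcription, which is worth keeping in mind when ranking candidate index names by weight.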
merging the grade information with the same medical significance from the plurality of fourth independent data sets to form a fifth merged data set; that is, regardless of the original unit and normal value range, the numerical values under the index are uniformly reclassified according to their medical significance, providing a large amount of homogeneous data for subsequent medical analysis;
checking the conflicting data that cannot be merged, namely errors introduced in the preceding cleaning process or anomalies carried by the data itself, and obtaining the final metering information database once correction is complete.
In another embodiment of the invention:
as shown in fig. 1, the data in the original medical database are in a mixed state and are marked as metering, level and illegal word information according to the information classification principle;
as shown in fig. 2, the pure metering information database is finally obtained by further identifying and processing the metering information, and includes the following specific steps:
step 1: removing the grade information (+, -, negative, positive) and illegal words (undetected) that do not belong to the metering information, and routing them into their corresponding standardized cleaning processes, to obtain a second metering information data set;
step 2: according to the normal value range provided by the hospital for each numerical value, the four values 0.01, 0.02, 0.6 and 0.7 all have a normal value range of 0-0.5, so values greater than or equal to 0.5 are positive in the medical sense and are marked as 2, and values less than 0.5 are normal, i.e. negative, and are marked as 1; the class A mapping rule is written accordingly;
step 3: grouping together the numerical values that share the same normal value range according to the A-type mapping rule to form a third metering information data set; there are two normal value ranges (0-0.5 and 0-1) in FIG. 2, so two third metering information data sets are formed;
step 4: on the basis of the third metering information data sets, converting all numerical values into hierarchical data with medical significance, namely 1 or 2, according to the corresponding normal value range, to form the fourth independent data sets;
step 5: merging the data in the fourth independent data sets derived from the plurality of normal value ranges to obtain a total database with medical significance labels, namely the fifth merged data set. In principle, each person should have only one test record and therefore only one value, which corresponds to exactly one medical significance label, i.e. 1 or 2. However, if an error occurred in the preceding cleaning process, or the patient has two conflicting test results (an anomaly carried by the data itself), a merge conflict results;
step 6: checking the merge conflicts, namely returning to the original data set to find the patient's examination data, and correcting the conflict after confirming its source. A cleaned, pure metering information database is finally obtained.
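Steps 2 to 5 of this embodiment can be traced with a short sketch. Only the four values with the 0-0.5 range come from the text; the two records with the 0-1 range, the patient identifiers and the function names are hypothetical, and the rule that values at or above the upper limit are graded 2 follows the 0-0.5 example.

```python
from collections import defaultdict

# (patient_id, value, normal_range); the 0-1 rows are hypothetical.
records = [
    ("p1", 0.01, (0.0, 0.5)),
    ("p2", 0.02, (0.0, 0.5)),
    ("p3", 0.6, (0.0, 0.5)),
    ("p4", 0.7, (0.0, 0.5)),
    ("p5", 0.3, (0.0, 1.0)),
    ("p6", 1.2, (0.0, 1.0)),
]


def grade(value, normal_range):
    """Mapping rule from step 2: 2 (positive) at or above the upper limit, else 1."""
    _, upper = normal_range
    return 2 if value >= upper else 1


def split_by_range(rows):
    """Step 3: one data set per normal value range."""
    groups = defaultdict(list)
    for patient_id, value, normal_range in rows:
        groups[normal_range].append((patient_id, value))
    return dict(groups)


third = split_by_range(records)                      # third data sets
fourth = {rng: [(pid, grade(v, rng)) for pid, v in rows]
          for rng, rows in third.items()}            # step 4: graded data sets
fifth = [row for rows in fourth.values() for row in rows]  # step 5: merged

print(fifth)
# [('p1', 1), ('p2', 1), ('p3', 2), ('p4', 2), ('p5', 1), ('p6', 2)]
```

A patient appearing more than once in `fifth` would be a merge conflict of the kind handled in step 6.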
An embodiment of the present invention further provides a device for processing medical data of a metering information type, including:
the reading module is used for reading metering information with characteristic marks from the collected medical data of the mixed data type;
the abnormal data cleaning module is used for extracting abnormal data from the metering information with the characteristic mark and clearing the abnormal data;
the multi-unit characteristic processing module is used for independently extracting the original metering information marked in the same legal metering range, so that the extracted original metering information forms an independent data set;
the conversion module is used for converting the original metering information in the independent data set into corresponding hierarchical information;
the merging module is used for merging the hierarchical information generated by the independent data sets and merging and marking merging conflicts on the metering information generated by the independent data sets;
and the statistical analysis module is used for performing a systematic error check on the corrected metering information and marking the unqualified metering information.
It will be appreciated that modifications and variations are possible to those skilled in the art in light of the above teachings, and it is intended to cover all such modifications and variations as fall within the scope of the appended claims.

Claims (9)

1. A method for processing medical data of a metering information type, the method comprising the steps of:
acquiring metering information in a medical database, wherein the metering information is pure numerical data under each medical index in the medical database;
cleaning and integrating abnormal data in the metering information;
according to the legal metering range corresponding to the medical institution detection instrument, performing multi-unit characteristic processing on the medical index in the metering information;
independently extracting the original metering information marked with the same unit characteristic and the same legal metering range under the same medical index to form an independent data set;
carrying out grading information processing on the metering information of the independent data set;
merging the metering information and correcting the conflict;
carrying out statistical analysis on the corrected metering information to obtain the managed metering information;
the method for processing the multi-unit characteristics of the indexes in the metering information comprises the following specific steps:
according to the fact that the same medical index has different legal metering ranges, format standardization of multiple legal metering ranges is conducted on the same medical index;
adding corresponding hierarchical characteristics to multiple legal metering ranges according to different medical meanings represented by metering information under the medical index distributed in the legal metering ranges;
the metering information marked as the same legal metering range forms the same unit characteristic, and the unit characteristic is added to the corresponding medical index, so that the same medical index has a plurality of unit characteristics; the specific algorithm comprises the following steps:
1) Standardizing the physical examination data index names from different sources according to an international term set, establishing a user-defined standard term set on the basis of the standardized term set, and constructing a standard distribution database of the metering information indexes from previously cleaned metering information data;
2) Processing the data to be cleaned by an algorithm to obtain a list of data in non-pure-numerical form; performing regular-expression matching to correct illegal numerical forms and obtain pure metering information data; and, at the same time, performing a logical comparison against the legal range given by the user-defined standard term set and clearing the values in a metering index that are smaller than the lower limit of the reference value or larger than the upper limit of the reference value, to obtain the metering information data within the legal range;
3) Comparing the medical metering data to be confirmed with the medical standard term distribution database for similarity: extracting, according to the corresponding medical reference range in the reference range data table given by the original institution, the whole column of data under the same index from the data of the same institution; comparing the extracted data with the data of the corresponding standard distribution database in the standard term library; and computing the relevant parameters of the extracted data and of the standard term library data, where the correlation coefficient is $r$, the medians of the extracted data and of the standard term library data are $m_1$ and $m_2$ respectively, the first quartiles are $a_1$ and $a_2$ respectively, and the third quartiles are $b_1$ and $b_2$ respectively; calculating a weight value from these statistical parameters, the weight value $w$ being calculated as: $w = r \times 10 - (m_1 - m_2) \times 3 - \frac{a_1 - a_2}{a_1} \times 3 - \frac{b_1 - b_2}{b_1} \times 3$; then displaying and comparing a frequency histogram of the total data volume, a box plot of the data distribution and a density distribution plot of the data, the algorithm recommending the most similar index name in the standard term library according to the weight value; if the index to be cleaned does not exist in the existing standard distribution database, only displaying the distribution of the index, calculating the relevant parameters, and generating a box plot and a density distribution plot of the index, to form a correlation statistical result and a distribution diagram;
4) After obtaining the correlation statistical result and the distribution diagram generated for the index values of the same unit under each index, recommending the final metering data index name and the corresponding correct reference range according to the weight value and the distribution form; standardizing the index values by an algorithm; and converting the index values, according to the medical reference range, into coarse-grained graded data according to the conversion rule: 1 represents low, 2 represents normal, and 3 represents high, for use in subsequent data cleaning;
5) Merging the cleaned metering information data by identical term column, and performing distribution display and systematic-error quality inspection.
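The weight value of step 3) can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names `quartiles` and `weight_value` are hypothetical, the correlation coefficient `r` is taken as an input because the claim does not state how it is computed, and the quartile interpolation method is an assumption.

```python
import statistics


def quartiles(values):
    """Return (first quartile, median, third quartile) of a numeric list.

    The "inclusive" interpolation method is an assumption; the claim does
    not say how the quartiles are estimated.
    """
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q1, q2, q3


def weight_value(r, extracted, standard):
    """Weight w = r*10 - (m1 - m2)*3 - (a1 - a2)/a1*3 - (b1 - b2)/b1*3.

    extracted: values of the index to be confirmed (subscript 1)
    standard:  values from the standard distribution database (subscript 2)
    Raises ZeroDivisionError if a1 or b1 is zero; the claim does not cover
    that case.
    """
    a1, m1, b1 = quartiles(extracted)
    a2, m2, b2 = quartiles(standard)
    return r * 10 - (m1 - m2) * 3 - (a1 - a2) / a1 * 3 - (b1 - b2) / b1 * 3
```

Under this reading, an extracted column that matches the standard distribution exactly and has r = 1 scores 10, and any difference in which the extracted median or quartiles exceed the standard ones lowers the score.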
2. The method for processing medical data of a metering information type according to claim 1, wherein the method for acquiring the metering information in the medical database comprises the following specific steps:
adding a characteristic mark to the metering information collected from the cooperative medical institution, and extracting the metering information from the medical database according to the characteristic mark to form an independent metering information database.
3. The method for processing medical data of a metering information type according to claim 2, wherein the specific method for adding the feature marks to the metering information is as follows: in the process of column name standardization, a characteristic mark is added to each column of metering information for marking the data type of the column.
4. The method for processing medical data of a metering information type according to claim 1, wherein the abnormal data is specifically:
non-pure-numerical data in the medical database, comprising pure-text information, pure-grade information, illegal information without specific meaning, and mixed information combining any one or more of these types with numerical information.
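One way to separate pure numerical values from the abnormal data described above is sketched below. The regular expression, the rule that a mixed string is rescued only when it contains exactly one number, and the function name `clean_value` are assumptions for illustration; the claims only state that regular matching and a legal-range check are used.

```python
import re

# Optional sign, digits, optional decimal part.
NUMBER = re.compile(r"[+-]?\d+(?:\.\d+)?")


def clean_value(raw, lower, upper):
    """Return a float inside [lower, upper], or None if the entry is abnormal.

    lower/upper stand for the legal range of the index; the bounds passed in
    by the caller are hypothetical.
    """
    text = str(raw).strip()
    if NUMBER.fullmatch(text):
        value = float(text)
    else:
        # Mixed information: keep it only if exactly one number is embedded.
        found = NUMBER.findall(text)
        if len(found) != 1:
            return None
        value = float(found[0])
    if value < lower or value > upper:
        return None  # outside the legal range
    return value
```

For example, `clean_value("5.6 mmol/L", 0, 100)` yields `5.6`, while pure text such as `"positive"` or an ambiguous entry such as `"3-5"` is rejected.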
5. The method for processing medical data of a metering information type according to claim 1, wherein the method for performing multi-unit characteristic processing on the medical index in the metering information further comprises a method for generating an A-type mapping table:
according to the normal value ranges that a medical institution provides for a given medical index under different detection methods and different detection batches, forming a rule table, namely the A-type mapping table, that corresponds to the numerical value of each item of metering information and is used for judging its medical significance, so as to mark the medical significance behind the numerical value of each item of metering information; and, according to the A-type mapping table, independently extracting the numerical values marked with the same normal value range under the medical index.
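A rule table of this kind can be sketched as a lookup from (index, unit characteristic) to a normal value range, combined with the 1/2/3 grading rule of claim 1. The table layout, the names `ReferenceRange`, `MAPPING_TABLE` and `grade`, and the numeric ranges are illustrative placeholders, not values taken from the patent or from clinical guidance.

```python
from typing import NamedTuple


class ReferenceRange(NamedTuple):
    lower: float
    upper: float


# Hypothetical A-type mapping table: one normal range per unit characteristic.
MAPPING_TABLE = {
    ("glucose", "mmol/L"): ReferenceRange(3.9, 6.1),
    ("glucose", "mg/dL"): ReferenceRange(70.0, 110.0),
}


def grade(index, unit_feature, value, table=MAPPING_TABLE):
    """Convert a value to graded form: 1 = low, 2 = normal, 3 = high."""
    ref = table[(index, unit_feature)]
    if value < ref.lower:
        return 1
    if value > ref.upper:
        return 3
    return 2
```

Because the same index appears once per unit characteristic, values reported in different units are graded against their own range and become comparable after conversion.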
6. The method for processing medical data of a metering information type according to claim 1, wherein the hierarchical information processing of the metering information comprises:
converting the original metering information into corresponding hierarchical information according to the hierarchical characteristics, then merging the hierarchical information generated by each independent data set, so that all the metering information of the same unit characteristic under the same medical index is finally converted into the corresponding hierarchical information.
7. The method for processing medical data of a metering information type according to claim 1, wherein the method for performing merging conflict correction on the metering information specifically comprises:
merging the original metering information in the independent data sets with the same medical index and the same unit characteristic, marking the metering information corresponding to two or more same medical indexes and the same unit characteristic of the same patient as a merging conflict, and finally selecting the only and correct metering information from the merging conflict.
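The conflict-marking part of this step can be sketched as a grouping operation. The record layout `(patient_id, index, unit_feature, value)` and the function name `merge_with_conflicts` are assumptions; the claim does not specify how the unique correct value is finally chosen, so the sketch only collects the candidates.

```python
from collections import defaultdict


def merge_with_conflicts(datasets):
    """Merge independent data sets and separate out merge conflicts.

    datasets: iterable of lists of (patient_id, index, unit_feature, value).
    Returns (merged, conflicts): merged maps each key to its single value;
    conflicts maps each key with two or more records to all candidates.
    """
    grouped = defaultdict(list)
    for dataset in datasets:
        for patient_id, index, unit_feature, value in dataset:
            grouped[(patient_id, index, unit_feature)].append(value)

    merged, conflicts = {}, {}
    for key, values in grouped.items():
        if len(values) == 1:
            merged[key] = values[0]
        else:
            conflicts[key] = values  # left for a later selection step
    return merged, conflicts
```

Keeping the conflicting candidates together, rather than silently keeping the first or last value, leaves the selection of the correct record as an explicit later step.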
8. The method for processing medical data of a metering information type according to claim 1, wherein the statistical analysis of the corrected metering information is performed by:
carrying out a systematic-error check of the corrected metering information against the metering information of the same index and the same unit characteristic that has been collected from other cooperating medical institutions and cleaned, marking unqualified metering information according to the definition of consistency used in statistics, and further correcting the marked metering information until it is confirmed to be qualified, to obtain the managed metering information.
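The claim does not define the consistency criterion, so the following is only one possible reading: compare the median of one institution's cleaned values with the median of the pooled values from other institutions, and flag the index when the relative deviation exceeds a tolerance. The function name `flag_systematic_error` and the 10% default tolerance are assumptions.

```python
import statistics


def flag_systematic_error(own_values, reference_values, tolerance=0.1):
    """Return True if own_values look systematically shifted.

    own_values:       cleaned values of one index/unit from one institution
    reference_values: pooled cleaned values of the same index/unit from
                      other cooperating institutions
    tolerance:        allowed relative deviation of the medians (assumed)
    """
    own_median = statistics.median(own_values)
    ref_median = statistics.median(reference_values)
    if ref_median == 0:
        return own_median != 0
    return abs(own_median - ref_median) / abs(ref_median) > tolerance
```

A median comparison is used here because it is insensitive to a few remaining outliers; other consistency measures would fit the claim wording equally well.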
9. A device for processing medical data of the metering information type, which employs the method for processing medical data of the metering information type according to any one of claims 1 to 8, characterized in that the device comprises the following modules:
the reading module is used for reading metering information with characteristic marks from the collected medical data of the mixed data type;
the abnormal data cleaning module is used for extracting abnormal data from the metering information with the characteristic mark and clearing the abnormal data;
the multi-unit characteristic processing module is used for independently extracting the original metering information marked in the same legal metering range to enable the extracted original metering information to form an independent data set;
the conversion module is used for converting the original metering information in the independent data set into corresponding hierarchical information;
the merging module is used for merging the hierarchical information generated by the independent data sets, merging the metering information generated by the independent data sets, and marking merge conflicts;
the statistical analysis module is used for checking the corrected metering information for systematic errors and marking unqualified metering information;
the multi-unit characteristic processing module is implemented by the following steps:
since the same medical index can have different legal metering ranges, standardizing the format of the multiple legal metering ranges of the same medical index;
adding corresponding hierarchical characteristics to the multiple legal metering ranges according to the different medical meanings represented by the metering information of the medical index that falls within each legal metering range;
the metering information marked with the same legal metering range forms the same unit characteristic, and the unit characteristic is added to the corresponding medical index, so that the same medical index has a plurality of unit characteristics; the specific algorithm comprises the following steps:
1) Standardizing the physical examination data index names from different sources according to an international term set, establishing a user-defined standard term set on the basis of the standardized term set, and constructing a standard distribution database of the metering information indexes from previously cleaned metering information data;
2) Processing the data to be cleaned by an algorithm to obtain a list of data in non-pure-numerical form; performing regular-expression matching to correct illegal numerical forms and obtain pure metering information data; and, at the same time, performing a logical comparison against the legal range given by the user-defined standard term set and clearing the values in a metering index that are smaller than the lower limit of the reference value or larger than the upper limit of the reference value, to obtain the metering information data within the legal range;
3) Comparing the medical metering data to be confirmed with the medical standard term distribution database for similarity: extracting, according to the corresponding medical reference range in the reference range data table given by the original institution, the whole column of data under the same index from the data of the same institution; comparing the extracted data with the data of the corresponding standard distribution database in the standard term library; and computing the relevant parameters of the extracted data and of the standard term library data, where the correlation coefficient is $r$, the medians of the extracted data and of the standard term library data are $m_1$ and $m_2$ respectively, the first quartiles are $a_1$ and $a_2$ respectively, and the third quartiles are $b_1$ and $b_2$ respectively; calculating a weight value from these statistical parameters, the weight value $w$ being calculated as: $w = r \times 10 - (m_1 - m_2) \times 3 - \frac{a_1 - a_2}{a_1} \times 3 - \frac{b_1 - b_2}{b_1} \times 3$; then displaying and comparing a frequency histogram of the total data volume, a box plot of the data distribution and a density distribution plot of the data, the algorithm recommending the most similar index name in the standard term library according to the weight value; if the index to be cleaned does not exist in the existing standard distribution database, only displaying the distribution of the index, calculating the relevant parameters, and generating a box plot and a density distribution plot of the index, to form a correlation statistical result and a distribution diagram;
4) After obtaining the correlation statistical result and the distribution diagram generated for the index values of the same unit under each index, recommending the final metering data index name and the corresponding correct reference range according to the weight value and the distribution form; standardizing the index values by an algorithm; and converting the index values, according to the medical reference range, into coarse-grained graded data according to the conversion rule: 1 represents low, 2 represents normal, and 3 represents high, for use in subsequent data cleaning;
5) Merging the cleaned metering information data by identical term column, and performing distribution display and systematic-error quality inspection.
CN202110088239.9A 2021-01-22 2021-01-22 Method and device for processing medical data of metering information type Active CN112768058B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110088239.9A CN112768058B (en) 2021-01-22 2021-01-22 Method and device for processing medical data of metering information type

Publications (2)

Publication Number Publication Date
CN112768058A CN112768058A (en) 2021-05-07
CN112768058B true CN112768058B (en) 2022-12-02

Family

ID=75705633

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110088239.9A Active CN112768058B (en) 2021-01-22 2021-01-22 Method and device for processing medical data of metering information type

Country Status (1)

Country Link
CN (1) CN112768058B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113821503A (en) * 2021-09-23 2021-12-21 北京金山云网络技术有限公司 Medical data processing method and device and edge server
CN116030950B (en) * 2023-03-27 2023-06-23 武汉大学人民医院(湖北省人民医院) Medical data integration management method

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109524088A (en) * 2018-10-27 2019-03-26 平安医疗健康管理股份有限公司 Medical monitoring method, device, terminal and medium based on data visualization

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130030260A1 (en) * 2011-07-28 2013-01-31 Sean Hale System and method for biometric health risk assessment
CN107330297A (en) * 2017-07-19 2017-11-07 湖南暄程科技有限公司 A kind of emergency system and method
CN108960445A (en) * 2018-06-29 2018-12-07 中国南方电网有限责任公司超高压输电公司检修试验中心 A kind of direct current grounding pole method for evaluating state based on Set Pair Analysis
CN109509517A (en) * 2018-10-16 2019-03-22 华东理工大学 A kind of medical test Index for examination modified method automatically
CN111755121A (en) * 2019-03-27 2020-10-09 北京菁医林国际医院管理有限公司 Method and device for evaluating physical development trend of juveniles
CN111508574A (en) * 2019-04-19 2020-08-07 上海智众医疗科技有限公司 Inspection and inspection data docking method and system
CN110010250B (en) * 2019-04-29 2023-05-26 青岛科技大学 Cardiovascular disease patient weakness grading method based on data mining technology
CN110867237A (en) * 2019-11-15 2020-03-06 曹庆恒 Method, system and equipment for managing rule base of reasonable and compliant medication system

Also Published As

Publication number Publication date
CN112768058A (en) 2021-05-07

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant