CN111415749A - Information processing method, information processing apparatus, and computer-readable storage medium - Google Patents

Info

Publication number
CN111415749A
Authority
CN
China
Prior art keywords
information
attribute information
quality control
medical health
control rule
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010168641.3A
Other languages
Chinese (zh)
Inventor
王利
宋志朋
罗英群
吕令广
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ZTE ICT Technologies Co Ltd
Original Assignee
ZTE ICT Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ZTE ICT Technologies Co Ltd
Priority to CN202010168641.3A (CN111415749A)
Priority to PCT/CN2020/096108 (WO2021179461A1)
Publication of CN111415749A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H: HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H: HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H10/00: ICT specially adapted for the handling or processing of patient-related medical or healthcare data
    • G16H10/60: ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H: HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/30: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Public Health (AREA)
  • Data Mining & Analysis (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Biomedical Technology (AREA)
  • Databases & Information Systems (AREA)
  • Pathology (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The invention provides an information processing method, an information processing apparatus, and a computer-readable storage medium. The information processing method includes the steps of: acquiring entity information existing in a data model of any medical health data source and attribute information contained in the entity information; searching the data model of at least one other medical health data source to obtain target attribute information matching the attribute information; applying the data quality control rule information of the target attribute information to the attribute information, and calculating the proportion of the attribute information that conforms to the data quality control rule information; comparing the proportion calculation result with a proportion calculation result threshold; and adding the data quality control rule information to the data quality control rule information set of the attribute information according to the comparison result. The technical solution of the invention can accurately and efficiently migrate the data quality control information in one medical health database to another medical health database.

Description

Information processing method, information processing apparatus, and computer-readable storage medium
Technical Field
The present invention relates to the technical field of medical health data information processing, and in particular, to an information processing method, an information processing apparatus, and a computer-readable storage medium.
Background
High-quality electronic medical health data is an important basis for subsequent in-depth analysis, mining, and use. During the construction of an information system, anomalies or errors in data acquisition, entry, transfer, loading, integration, maintenance, and other links make data quality problems such as errors, inconsistencies, and duplication difficult to avoid. Meanwhile, medical health data is dispersed across multiple information systems that are heterogeneous with respect to one another, so manually developing data quality control from scratch for every system costs considerable time and money.
Therefore, one of the technical problems to be solved in the art is how to accurately and efficiently migrate data quality control information in one medical health database to another medical health database.
Moreover, any discussion of the prior art throughout the specification is not an admission that the prior art is necessarily known to a person of ordinary skill in the art, and any discussion of the prior art throughout the specification is not an admission that the prior art is necessarily widely known or forms part of common general knowledge in the field.
Disclosure of Invention
The present invention is directed to solving at least one of the problems of the prior art or the related art.
To this end, a first object of the present invention is to provide an information processing method.
A second object of the present invention is to provide an information processing apparatus.
A third object of the present invention is to provide a computer-readable storage medium.
To achieve the first object of the present invention, an embodiment of the present invention provides an information processing method, adapted to migrate data quality control rule information in medical health data between at least two medical health data sources, the information processing method including the steps of: acquiring entity information existing in a data model of any medical health data source and attribute information contained in the entity information; for the entity information and the attribute information, searching the data model of at least one other medical health data source, relative to said any medical health data source, to obtain target attribute information matching the attribute information; applying the data quality control rule information of the target attribute information to the attribute information, and calculating the proportion of the attribute information that conforms to the data quality control rule information to obtain a proportion calculation result; comparing the proportion calculation result with a proportion calculation result threshold; and adding the data quality control rule information to the data quality control rule information set of the attribute information according to the comparison result.
According to this embodiment, the data quality control rule information can be accurately and efficiently migrated from one medical health data source to another, which improves the utilization rate and accuracy of the medical health data and avoids the large amount of time and cost otherwise needed to manually develop data quality control from scratch for every system.
In addition, the technical scheme provided by the invention can also have the following additional technical characteristics:
In the above technical solution, before the step of acquiring the entity information existing in the data model of any medical health data source and the attribute information contained in the entity information is executed, the information processing method further includes the step of: for each medical health data source, establishing a data model corresponding to that medical health data source, wherein the data model comprises at least one piece of entity information, and each piece of entity information comprises at least one piece of attribute information.
Before the data quality control rule information is migrated, the data models corresponding to the respective medical health data sources are first established, which ensures that the migration can proceed smoothly.
In any of the above technical solutions, before the step of acquiring the entity information existing in the data model of any medical health data source and the attribute information contained in the entity information is executed, the information processing method further includes the step of: for each medical health data source, establishing a data quality control rule information set corresponding to that medical health data source, wherein the data quality control rule information set includes at least one piece of data quality control rule information.
For each medical health data source, at least one corresponding data quality control rule information set is established, so that the subsequent matching and migration of data quality control rule information can proceed smoothly.
In any of the above technical solutions, the step of searching, for the entity information and the attribute information, the data model of at least one other medical health data source relative to said any medical health data source to obtain target attribute information matching the attribute information includes: for the entity information, performing a first retrieval in the data model of the at least one other medical health data source to obtain a target entity information retrieval result; and performing a second retrieval according to the target entity information retrieval result to obtain the target attribute information.
This embodiment first retrieves the entity information that corresponds across the data sources, and then retrieves the corresponding attribute information from within that entity information.
In any of the above technical solutions, the step of performing the second retrieval according to the target entity information retrieval result to obtain the target attribute information includes: when the target entity information retrieval result is that target entity information matching the entity information has been obtained, performing the second retrieval for the attribute information within the scope of the target entity information to obtain the target attribute information; or, when the target entity information retrieval result is that no target entity information matching the entity information has been obtained, performing the second retrieval for the attribute information within the scope of all entity information in the data model of the at least one other medical health data source to obtain the target attribute information.
By choosing the scope of the second retrieval according to the result of the first retrieval, this embodiment ensures both the efficiency and the success rate of retrieval matching.
In any of the above technical solutions, the first search is a fuzzy matching search; and/or the second search is a fuzzy matching search.
Fuzzy matching retrieval mitigates the low retrieval accuracy that small differences in names would otherwise cause.
In any of the above technical solutions, the step of adding the data quality control rule information to the data quality control rule information set of the attribute information according to the comparison result includes: when the comparison result is that the proportion calculation result is greater than the proportion calculation result threshold, adding the data quality control rule information to the data quality control rule information set of the attribute information.
This embodiment ensures that the migrated data quality control rule information is reasonable and well matched, and avoids migrating data quality control rule information between attribute information that does not correspond or match.
In any of the above technical solutions, the proportion calculation result threshold is determined according to the matching result of the entity information; and/or the proportion calculation result threshold is determined according to the matching result of the attribute information.
The purpose of this embodiment is to set different proportion calculation result thresholds according to actual conditions or needs, in particular according to how well the entity information and the attribute information match, so as to keep the calculation results reasonable.
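As a minimal sketch of how a threshold might depend on the matching results, the following Python function raises the threshold when the matches are weaker. The function name `select_threshold`, its two flags, and every numeric value are hypothetical; the specification gives no concrete thresholds and leaves their choice to the practitioner.

```python
def select_threshold(entity_matched: bool, attribute_exact: bool) -> float:
    """Return a hypothetical proportion threshold for one attribute pair.

    Assumption: a weaker match should demand a larger share of conforming
    values before a rule is migrated. All numbers are illustrative only.
    """
    threshold = 0.80                 # baseline when both matches are strong
    if not entity_matched:
        threshold += 0.10            # attribute was found outside a matched entity
    if not attribute_exact:
        threshold += 0.05            # attribute name matched only approximately
    return round(min(threshold, 1.0), 2)


print(select_threshold(entity_matched=True, attribute_exact=True))
print(select_threshold(entity_matched=False, attribute_exact=False))
```

Other policies, such as a lookup table keyed by match type, would serve the same purpose.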
To achieve the second object of the present invention, an embodiment of the present invention provides an information processing apparatus including: a memory storing a computer program; a processor executing a computer program; wherein the processor, when executing the computer program, implements the steps of the information processing method according to any of the embodiments of the present invention.
The information processing apparatus provided in the embodiment of the present invention implements the steps of the information processing method according to any embodiment of the present invention, and thus has all the advantages of the information processing method according to any embodiment of the present invention, which are not described herein again.
To achieve the third object of the present invention, an embodiment of the present invention provides a computer-readable storage medium storing a computer program that, when executed, implements the steps of the information processing method according to any one of the embodiments of the present invention.
The computer-readable storage medium provided in the embodiments of the present invention implements the steps of the information processing method according to any embodiment of the present invention, so that the computer-readable storage medium has all the advantages of the information processing method according to any embodiment of the present invention, and details are not described herein again.
Additional aspects and advantages of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
FIG. 1 shows a flow chart of a first step of an information processing method of an embodiment of the invention;
FIG. 2 shows a flow chart of a second step of an information processing method of an embodiment of the invention;
FIG. 3 shows a flow chart of a third step of an information processing method of an embodiment of the present invention;
FIG. 4 shows a flow chart of a fourth step of an information processing method of an embodiment of the present invention;
FIG. 5 shows a flow chart of a fifth step of an information processing method of an embodiment of the present invention;
FIG. 6 shows a flow chart of a sixth step of an information processing method of an embodiment of the present invention;
FIG. 7 is a system composition diagram showing an information processing apparatus of an embodiment of the present invention;
FIG. 8 shows a flow chart of a seventh step of an information processing method of an embodiment of the present invention;
wherein the correspondence between the reference numbers and the names of the components in FIG. 7 is:
100: information processing apparatus, 110: memory, 120: processor.
Detailed Description
In order that the above objects, features and advantages of the present invention can be more clearly understood, a more particular description of the invention will be rendered by reference to the appended drawings. It should be noted that the embodiments and features of the embodiments of the present application may be combined with each other without conflict.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, however, the present invention may be practiced in other ways than those specifically described herein, and therefore the scope of the present invention is not limited by the specific embodiments disclosed below.
Information processing methods, information processing apparatuses, and computer-readable storage media according to some embodiments of the present invention are described below with reference to fig. 1 to 8.
Example one
As shown in fig. 1, the present embodiment provides an information processing method, which is suitable for migrating data quality control rule information in medical health data between at least two medical health data sources, and the information processing method includes the following steps:
Step S102: acquiring entity information existing in a data model of any medical health data source and attribute information contained in the entity information;
Step S104: for the entity information and the attribute information, searching the data model of at least one other medical health data source, relative to said any medical health data source, to obtain target attribute information matching the attribute information;
Step S106: applying the data quality control rule information of the target attribute information to the attribute information, and calculating the proportion of the attribute information that conforms to the data quality control rule information to obtain a proportion calculation result;
Step S108: comparing the proportion calculation result with a proportion calculation result threshold;
Step S110: adding the data quality control rule information to the data quality control rule information set of the attribute information according to the comparison result.
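The five steps above can be sketched end to end in Python as follows. This is an illustration under stated assumptions, not the patented implementation: a rule is modelled as a predicate over a single value, the target attribute is found by a case-insensitive exact name match, and the names `migrate_rules`, `other_source_rules`, and `one_digit_code` are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Set

Rule = Callable[[str], bool]  # assumption: a rule is a predicate over one value


def migrate_rules(
    attribute_name: str,
    values: List[str],
    other_source_rules: Dict[str, Dict[str, Rule]],
    threshold: float,
) -> Set[str]:
    """Return the names of the rules migrated to `attribute_name`."""
    # S104: find the target attribute (exact, case-insensitive match here).
    target: Optional[str] = next(
        (name for name in other_source_rules
         if name.lower() == attribute_name.lower()),
        None,
    )
    migrated: Set[str] = set()
    if target is None or not values:
        return migrated
    for rule_name, rule in other_source_rules[target].items():
        # S106: proportion of values that conform to the rule.
        proportion = sum(1 for v in values if rule(v)) / len(values)
        # S108 and S110: migrate only when the proportion exceeds the threshold.
        if proportion > threshold:
            migrated.add(rule_name)
    return migrated


# S102 is represented by the caller supplying the attribute and its values.
rules = {"Sex": {"one_digit_code": lambda v: v in {"0", "1", "2", "9"}}}
print(migrate_rules("sex", ["1", "2", "2", "male"], rules, threshold=0.7))
print(migrate_rules("sex", ["1", "2", "2", "male"], rules, threshold=0.8))
```

Three of the four sample values conform, so the rule is migrated at a threshold of 0.7 but not at 0.8.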
The purpose of this embodiment is to achieve accurate and efficient migration of data quality control rule information from one medical health data source to another. Because medical health information systems must be built in accordance with national standards, industry standards, domain knowledge of the medical health field, and the like, the information contained in the data models of different information systems is often semantically similar. Data items with similar semantics in different information systems therefore generally have similar data quality control requirements and corresponding data quality control rules. It is thus feasible to transfer data quality control rule information from one medical health data source to another by retrieval matching.
To migrate the data quality control rule information, it is first necessary to acquire the entity information existing in the data model of any medical health data source and the attribute information contained in that entity information. The medical health data sources currently on the market are numerous and of different types; for example, they may include structured sources of various forms such as relational databases, XML documents, and JSON documents. A corresponding data model needs to be established for each data source. The data model is composed of entity information and attribute information, and each piece of entity information comprises one or more pieces of attribute information. The entity information and the attribute information each have a number of metadata items, such as an entity name, an entity synonym, an attribute synonym, an attribute data type, an attribute data length, an attribute value range, and an attribute value dictionary. Only after the entity information and the attribute information it contains have been acquired can one or more pieces of attribute information in the entity information be retrieved and matched in the subsequent steps so that the corresponding data quality control rule information can be migrated. For example, in the data model of a medical health data source, the basic information of a patient may be entity information, and the number, name, age, and sex of the patient may be attribute information.
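A data model of this kind can be represented, for illustration, with a few Python dataclasses. The class and field names below are assumptions chosen to mirror the metadata items listed above, and the sample values (such as the age range and the source name `hospital_a`) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Attribute:
    """One piece of attribute information and its metadata items."""
    name: str
    synonyms: List[str] = field(default_factory=list)
    data_type: str = "string"
    length: Optional[int] = None
    value_range: Optional[str] = None
    value_dictionary: Optional[List[str]] = None


@dataclass
class Entity:
    """One piece of entity information holding one or more attributes."""
    name: str
    synonyms: List[str] = field(default_factory=list)
    attributes: List[Attribute] = field(default_factory=list)


@dataclass
class DataModel:
    """The data model built for a single medical health data source."""
    source: str
    entities: List[Entity] = field(default_factory=list)


# The patient example from the text: one entity with four attributes.
patient = Entity(
    name="patient_basic_information",
    synonyms=["patient"],
    attributes=[
        Attribute("number"),
        Attribute("name"),
        Attribute("age", data_type="integer", value_range="0-150"),
        Attribute("sex", length=1, value_dictionary=["0", "1", "2", "9"]),
    ],
)
model = DataModel(source="hospital_a", entities=[patient])
attribute_names = [a.name for e in model.entities for a in e.attributes]
print(attribute_names)
```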
After the entity information and the attribute information included in the entity information are acquired, the entity information and the attribute information need to be retrieved from the data model of at least one other medical health data source relative to any one medical health data source so as to acquire target attribute information matched with the attribute information. The purpose of this step is to obtain the same or corresponding attribute information in two different medical health data sources, so as to achieve the purpose of adding the data quality control rule information of the attribute information in one medical health data source to the data quality control rule information set of the attribute information in the other medical health data source.
For example, suppose that, in the data model of one medical health data source, the entity information for the patient's basic information includes attribute information for the patient's identification number. This step then needs to find, in the data model of another medical health data source, the attribute information corresponding to the patient's identification number, so that the data quality control rule information "GB 11643-1999" of that attribute information can be added to the data quality control rule information set of the corresponding attribute information in the other medical health data source.
To ensure the accuracy of the migration, in this embodiment, after the data quality control rule information of the target attribute information is applied to the attribute information, the proportion of the attribute information that conforms to the data quality control rule information is calculated to obtain a proportion calculation result. This result is used to judge whether the data quality control rule information can be migrated. After the proportion calculation result is compared with the proportion calculation result threshold, if the comparison shows the proportion calculation result to be acceptable, the data quality control rule information may be added to the data quality control rule information set of the attribute information, completing the migration of one or more pieces of data quality control rule information.
According to this embodiment, the data quality control rule information can be accurately and efficiently migrated from one medical health data source to another, which improves the utilization rate and accuracy of the medical health data and avoids the large amount of time and cost otherwise needed to manually develop data quality control from scratch for every system.
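To make the notion of a rule concrete, the sketch below expresses the check character defined by GB 11643-1999 as a Python predicate. It is a partial check under stated assumptions: it verifies only the 18-character format and the check character (weighted sum modulo 11), and does not validate the region code or the birth date. The function name `conforms_to_gb11643` is hypothetical, and the sample number is illustrative only.

```python
import re

# Weights for the first 17 digits and the check characters indexed by
# (weighted sum mod 11), as defined for 18-character identification numbers.
WEIGHTS = (7, 9, 10, 5, 8, 4, 2, 1, 6, 3, 7, 9, 10, 5, 8, 4, 2)
CHECK_CODES = "10X98765432"
ID_PATTERN = re.compile(r"[0-9]{17}[0-9X]")


def conforms_to_gb11643(value: str) -> bool:
    """Return True if `value` has a valid format and check character."""
    candidate = value.strip().upper()
    if ID_PATTERN.fullmatch(candidate) is None:
        return False
    total = sum(int(d) * w for d, w in zip(candidate[:17], WEIGHTS))
    return CHECK_CODES[total % 11] == candidate[17]


# Illustrative number: its weighted sum is 167, and 167 mod 11 = 2,
# which selects the check character "X".
print(conforms_to_gb11643("11010519491231002X"))
print(conforms_to_gb11643("110105194912310021"))
```

A predicate of this shape can be applied to every value of an attribute to obtain the share of conforming values.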
Example two
As shown in fig. 2, the present embodiment provides an information processing method, and in addition to the technical features of any of the above embodiments, the present embodiment further includes the following technical features.
Before the step of acquiring the entity information existing in the data model of any one of the medical health data sources and the attribute information contained in the entity information is executed, the information processing method further comprises the following steps:
Step S202: for each medical health data source, establishing a data model corresponding to that medical health data source.
The data model comprises at least one entity information, and any entity information comprises at least one attribute information.
The establishment of the data model is one of the indispensable links in the construction of medical health data sources. Before the migration of the data quality control rule information, the present embodiment first needs to ensure that the data models respectively corresponding to the medical health data sources are established.
Example three
As shown in fig. 3, the present embodiment provides an information processing method, and in addition to the technical features of any of the above embodiments, the present embodiment further includes the following technical features.
Before the step of acquiring the entity information existing in the data model of any one of the medical health data sources and the attribute information contained in the entity information is executed, the information processing method further comprises the following steps:
Step S302: for each medical health data source, establishing a data quality control rule information set corresponding to that medical health data source.
The data quality control rule information set includes at least one piece of data quality control rule information.
The data quality control rule information refers to the rules or standards against which the recording of medical health data is measured. For example, for the patient basic information in medical record information, the sex of the patient needs to be registered, and the coding of sex must comply with the national standard GB/T 2261.1-2003. The national standard GB/T 2261.1-2003 is therefore one piece of data quality control rule information for the medical health data. As another example, for the patient basic information in medical record information, the identification number of the patient needs to be registered, and the coding of the identification number must follow the national standard GB 11643-1999. The national standard GB 11643-1999 is therefore another piece of data quality control rule information for the medical health data.
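A data quality control rule information set can be sketched in Python as a mapping from attribute to a list of rules. The structure and the names `QualityRule`, `rule_sets`, and `failed_standards` are hypothetical. The sex code table is an assumption about GB/T 2261.1-2003, taken here to be 0 (unknown), 1 (male), 2 (female), and 9 (unspecified).

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class QualityRule:
    """One piece of data quality control rule information."""
    standard: str                      # the standard the rule is derived from
    check: Callable[[str], bool]       # predicate applied to a single value


# Assumed code table of GB/T 2261.1-2003 (see the lead-in above).
SEX_CODES = {"0", "1", "2", "9"}

# One rule set per attribute; an attribute may have no rules at all.
rule_sets: Dict[str, List[QualityRule]] = {
    "patient_basic_information.sex": [
        QualityRule("GB/T 2261.1-2003", lambda value: value in SEX_CODES),
    ],
    "patient_basic_information.name": [],
}


def failed_standards(attribute: str, value: str) -> List[str]:
    """List the standards that `value` violates for `attribute`."""
    rules = rule_sets.get(attribute, [])
    return [rule.standard for rule in rules if not rule.check(value)]


print(failed_standards("patient_basic_information.sex", "1"))
print(failed_standards("patient_basic_information.sex", "M"))
```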
In this embodiment, for a plurality of attribute information in one data model, it is not required that each attribute information has corresponding data quality control rule information or a data quality control rule information set. The purpose of this embodiment is to search and match the attribute information having the data quality control rule information in the data model of one medical health data source with the attribute information having the data quality control rule information in the data model of another medical health data source, so as to complete the migration of the corresponding data quality control rule information. In addition, the present embodiment does not require that the two attribute information belonging to different medical health data sources belong to the same or matching entity information.
For each medical health data source, at least one corresponding data quality control rule information set is established; this set is the basis for the subsequent matching and migration of data quality control rule information.
Example four
As shown in fig. 4, the present embodiment provides an information processing method, and in addition to the technical features of any of the above embodiments, the present embodiment further includes the following technical features.
The step of searching, for the entity information and the attribute information, the data model of at least one other medical health data source relative to said any medical health data source to obtain the target attribute information matching the attribute information comprises the following steps:
Step S402: for the entity information, performing a first retrieval in the data model of the at least one other medical health data source to obtain a target entity information retrieval result;
Step S404: performing a second retrieval according to the target entity information retrieval result to obtain the target attribute information.
This embodiment provides a specific method for retrieving and matching attribute information. Since attribute information belongs to, or is contained in, entity information, this embodiment first performs a first retrieval for the entity information in the data model of the at least one other medical health data source to obtain a target entity information retrieval result. In other words, this embodiment first retrieves the entity information that matches across the data sources, and then retrieves the corresponding attribute information from within that entity information.
Example five
As shown in fig. 5, the present embodiment provides an information processing method, and in addition to the technical features of any of the above embodiments, the present embodiment further includes the following technical features.
The step of performing a second search according to the target entity information search result to obtain target attribute information includes:
step S502, judging that the target entity information retrieval result is the acquired target entity information matched with the entity information, and performing second retrieval by taking the target entity information as a range according to the attribute information to acquire the target attribute information; or
Step S504, the target entity information retrieval result is judged to be that the target entity information matched with the entity information is not obtained, and aiming at the attribute information, all entity information in the data model of the other at least one medical health data source is taken as a range, second retrieval is carried out, so that the target attribute information is obtained.
Specifically, there are two possibilities of the search result obtained by the first search. One possibility is that target entity information matching the entity information is acquired. Then, for the purpose of improving efficiency, the attribute information is further retrieved directly from the target entity information to obtain the required target attribute information from the target entity information. Another possibility is that target entity information matching the entity information is not obtained. The result shows that the attribute information respectively belonging to the two medical health data sources respectively belong to different entity information in the respective medical health data sources. In this case, the second search needs to be performed with all entity information in the data model of the at least one other medical health data source as a scope.
By implementing the second retrieval and implementing different subsequent retrieval matching modes according to different second retrieval results, the embodiment not only ensures the efficiency of retrieval matching, but also ensures the success rate of retrieval matching.
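The choice of scope in steps S502 and S504 can be sketched as follows. This is a minimal illustration only, assuming a data model represented as a dictionary that maps entity names to lists of attribute names, and assuming exact-name matching; the function name `second_retrieval` and the sample entity and attribute names are hypothetical and do not come from the application.

```python
from typing import Optional


def second_retrieval(
    attribute: str,
    target_model: dict[str, list[str]],
    target_entity: Optional[str],
) -> Optional[tuple[str, str]]:
    """Return (entity, attribute) of the matched target attribute, or None.

    target_entity is the result of the first retrieval: the matched entity
    name, or None when no matching entity was found.
    """
    if target_entity is not None:
        # S502: search only inside the matched target entity.
        scope = {target_entity: target_model[target_entity]}
    else:
        # S504: search across all entities of the other data source.
        scope = target_model
    for entity, attributes in scope.items():
        if attribute in attributes:
            return entity, attribute
    return None


# Hypothetical data model of the other medical health data source.
model = {
    "patient basic information": ["number", "name", "sex", "age"],
    "visit record": ["number", "visit date"],
}
narrow = second_retrieval("age", model, "patient basic information")
broad = second_retrieval("visit date", model, None)
```

The narrow scope inspects a single entity, while the broad scope may inspect every entity in the other data model, which is the efficiency trade-off the embodiment describes.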
Example six
The present embodiment provides an information processing method which, in addition to the technical features of any of the above embodiments, further includes the following technical features.
The first retrieval is a fuzzy matching retrieval; and/or the second retrieval is a fuzzy matching retrieval.
In other words, entity information and attribute information may be retrieved between different medical health data sources by fuzzy matching. Fuzzy matching mitigates the low retrieval accuracy caused by small differences in names.
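One possible way to realize a fuzzy matching retrieval is sketched below with the Python standard library's `difflib.get_close_matches`. The application does not name a particular fuzzy matching algorithm, so the similarity measure, the cutoff of 0.8, the helper name `fuzzy_find`, and the candidate names are all assumptions made for illustration.

```python
import difflib
from typing import Optional


def fuzzy_find(name: str, candidates: list[str], cutoff: float = 0.8) -> Optional[str]:
    """Return the candidate most similar to name, or None if none reaches cutoff."""
    matches = difflib.get_close_matches(name, candidates, n=1, cutoff=cutoff)
    return matches[0] if matches else None


# Hypothetical entity names in the other medical health data source.
candidates = ["patient_basic_info", "visit_record", "lab_result"]
hit = fuzzy_find("patient_basic_information", candidates)  # small name difference
miss = fuzzy_find("billing", candidates)                   # no similar name
```

A higher cutoff reduces false matches at the cost of missing names that differ more, so the value would need tuning against real entity and attribute names.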
Example seven
As shown in Fig. 6, the present embodiment provides an information processing method which, in addition to the technical features of any of the above embodiments, further includes the following technical features.
The step of adding the data quality control rule information to the data quality control rule information set of the attribute information according to the comparison result includes:
Step S602, when the comparison result is judged to be that the proportion calculation result is greater than the proportion calculation result threshold, adding the data quality control rule information to the data quality control rule information set of the attribute information.
It should be noted that, when the proportion of conformity with a data quality control rule is calculated for entity information and attribute information across different medical health data sources, attribute information whose values are drawn from a data dictionary may first have its values converted to unified standard values through a standard data dictionary, and the proportion is then calculated.
The specific value of the proportion calculation result threshold can be selected and adjusted by those skilled in the art according to actual needs. When the proportion calculation result is greater than the threshold, the data quality control rule information of the attribute information and of the target attribute information can be migrated between the two, and the data quality control rule information is added to the data quality control rule information set of the attribute information. When the proportion calculation result is less than or equal to the threshold, the data quality control rule information of the attribute information and of the target attribute information is not suitable for migration, and the data quality control rule information is added manually instead.
The embodiment ensures that migrated data quality control rule information is reasonable and well matched, and avoids migrating data quality control rule information between attribute information that does not correspond or match.
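The comparison in step S602 can be sketched as follows. This is an illustrative sketch under stated assumptions: a data quality control rule is modeled as a Python predicate, the helper names `conformity_proportion` and `migrate_rule` are hypothetical, and returning 0.0 for an empty value list is a choice made here, not something the application specifies.

```python
from typing import Callable, Iterable


def conformity_proportion(values: Iterable, rule: Callable[[object], bool]) -> float:
    """Fraction of values that satisfy rule (0.0 for an empty input, by assumption)."""
    values = list(values)
    if not values:
        return 0.0
    return sum(1 for v in values if rule(v)) / len(values)


def migrate_rule(values, rule, rule_set: list, threshold: float) -> bool:
    """S602: add rule to rule_set only if the proportion exceeds the threshold."""
    if conformity_proportion(values, rule) > threshold:
        rule_set.append(rule)
        return True
    return False  # left for manual rule creation


def is_valid_age(v) -> bool:
    # Rule taken from the age example: an integer from 0 to 200 inclusive.
    return isinstance(v, int) and 0 <= v <= 200


age_rules: list = []
accepted = migrate_rule([35, 46, 24, 57], is_valid_age, age_rules, threshold=0.95)
rejected = migrate_rule([35, -1, 999, 57], is_valid_age, [], threshold=0.95)
```

Note that the comparison is strict: a proportion exactly equal to the threshold does not trigger migration, matching the "less than or equal to" branch described above.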
Example eight
The present embodiment provides an information processing method, and in addition to the technical features of any of the above embodiments, the present embodiment further includes the following technical features.
The proportion calculation result threshold is determined according to the matching result of the entity information; and/or the proportion calculation result threshold is determined according to the matching result of the attribute information.
Specifically, in this embodiment, when both the entity information and the attribute information match between different medical health data sources, a first proportion calculation result threshold is used as the proportion calculation result threshold. When the entity information does not match but the attribute information matches, a second proportion calculation result threshold is used. The first and second thresholds are different from each other.
The purpose of this embodiment is to set different proportion calculation result thresholds according to actual conditions or needs, in particular the matching conditions of the entity information and the attribute information, so that the calculation results are judged reasonably.
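Selecting the threshold from the matching situation can be sketched as below. The application states only that the first and second thresholds differ; the function name `select_threshold`, the default values 0.95 and 0.98, and the idea that the unmatched-entity case is the stricter one are placeholders assumed for illustration.

```python
def select_threshold(
    entity_matched: bool,
    first_threshold: float = 0.95,   # assumed value: entity and attribute both match
    second_threshold: float = 0.98,  # assumed value: only the attribute matches
) -> float:
    """Pick the proportion threshold according to the entity matching result."""
    return first_threshold if entity_matched else second_threshold


both_match = select_threshold(entity_matched=True)
attribute_only = select_threshold(entity_matched=False)
```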
Example nine
As shown in Fig. 7, the present embodiment provides an information processing apparatus 100 including a memory 110 and a processor 120. The memory 110 stores a computer program. The processor 120 executes the computer program and, when doing so, implements the steps of the information processing method according to any of the embodiments of the present invention.
Example ten
The present embodiment provides a computer-readable storage medium storing a computer program that, when executed, implements the steps of the information processing method according to any one of the embodiments of the present invention.
DETAILED DESCRIPTION OF EMBODIMENT (S) OF INVENTION
As shown in Fig. 8, the present embodiment provides an information processing method adapted to migrate data quality control rule information in medical health data between at least two medical health data sources, the information processing method including the steps of:
Step S802, establishing, for each medical health data source, a data model corresponding to that medical health data source;
Step S804, establishing, for each medical health data source, a data quality control rule information set corresponding to that medical health data source;
Step S806, acquiring entity information existing in the data model of any one medical health data source and attribute information contained in the entity information;
Step S808, for the entity information, performing a first retrieval in the data model of at least one other medical health data source to obtain a target entity information retrieval result;
Step S810, judging the target entity information retrieval result, so that a second retrieval is performed according to the target entity information retrieval result to obtain target attribute information;
if the target entity information retrieval result is judged to be that target entity information has been acquired, step S812 is executed, and if it is judged to be that no target entity information has been acquired, step S814 is executed;
Step S812, for the attribute information, performing the second retrieval with the target entity information as the scope to obtain target attribute information;
Step S814, for the attribute information, performing the second retrieval with all entity information in the data model of the at least one other medical health data source as the scope to obtain target attribute information;
Step S816, applying the data quality control rule information of the target attribute information to the attribute information, and calculating the proportion of the attribute information that conforms to the data quality control rule information, to obtain a proportion calculation result;
Step S818, comparing the proportion calculation result with the proportion calculation result threshold;
if the comparison result is judged to be that the proportion calculation result is greater than the proportion calculation result threshold, step S820 is executed, and if the comparison result is judged to be that the proportion calculation result is less than or equal to the proportion calculation result threshold, step S822 is executed;
Step S820, adding the data quality control rule information to the data quality control rule information set of the attribute information;
Step S822, manually adding data quality control rule information.
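Steps S806 to S822 can be put together in a compact sketch. It is only one possible reading of the flow, under several assumptions that are not in the application: data models are dictionaries of entity name to attribute names, rules are predicates, matching is by exact name (a fuzzy match could be substituted), a single threshold is used, and every attribute that ends up with no adopted rule is placed on a list for manual handling. All names and sample data are hypothetical.

```python
def migrate_rules(source_model, source_values, target_model, target_rules, threshold):
    """Return (migrated, manual) for the attributes of source_model."""
    migrated = {}  # (entity, attribute) -> rules adopted from the other source
    manual = []    # (entity, attribute) pairs left for manual rule creation
    for entity, attributes in source_model.items():            # S806
        # S808: first retrieval, here by exact entity name.
        target_entity = entity if entity in target_model else None
        # S810 to S814: choose the scope of the second retrieval.
        scope = [target_entity] if target_entity is not None else list(target_model)
        for attribute in attributes:
            key = (entity, attribute)
            match = next(
                ((e, attribute) for e in scope if attribute in target_model[e]),
                None,
            )
            rules = target_rules.get(match, []) if match is not None else []
            values = source_values.get(key, [])
            adopted = []
            for rule in rules:
                # S816: proportion of values that conform to the rule.
                conforming = sum(1 for v in values if rule(v))
                proportion = conforming / len(values) if values else 0.0
                if proportion > threshold:                      # S818, S820
                    adopted.append(rule)
            if adopted:
                migrated[key] = adopted
            else:
                manual.append(key)                              # S822
    return migrated, manual


def age_rule(v) -> bool:
    return isinstance(v, int) and 0 <= v <= 200


# Hypothetical sources: the entity names differ, so S814 is exercised.
target_model = {"patient basic information": ["sex", "age"]}
target_rules = {("patient basic information", "age"): [age_rule]}
source_model = {"resident": ["age", "phone"]}
source_values = {
    ("resident", "age"): [35, 46, 24, 57],
    ("resident", "phone"): ["555-0100"],
}
migrated, manual = migrate_rules(
    source_model, source_values, target_model, target_rules, threshold=0.95
)
```

In this run the `age` attribute adopts the rule found under a differently named entity, while `phone` has no counterpart and is left for manual rule creation.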
Specifically, the present embodiment establishes a data quality control rule set for each medical health data source. For a first medical health data source, each piece of entity information in its data model is selected and retrieved among the entity information of the data models of the other medical health data sources, to find matching target entity information.
For first entity information, if matching entity information is retrieved in the other medical health data sources, the attribute information of the first entity information is retrieved among the attribute information of the matched entity information, to obtain first target attribute information matching the first attribute information.
For the first attribute information, after the matching first target attribute information in the other medical health data sources has been retrieved, if the first target attribute information has a first data quality control rule, that rule is applied to the first attribute information, and the proportion of the first attribute information that conforms to the first data quality control rule is calculated. If this proportion is greater than a first threshold, the first data quality control rule is added to the data quality control rule set of the first attribute information.
For the first entity information, if no matching entity information is retrieved in the other medical health data sources, the attribute information of the first entity information is retrieved among the attribute information of all entity information, to obtain second target attribute information matching second attribute information.
For the second attribute information, after the matching second target attribute information in the other medical health data sources has been retrieved, if the second target attribute information has a second data quality control rule, that rule is applied to the second attribute information, and the proportion of the second attribute information that conforms to the second data quality control rule is calculated. If this proportion is greater than a second threshold, the second data quality control rule is added to the data quality control rule set of the second attribute information.
For third attribute information, if no matching target attribute information is retrieved in the other medical health data sources, a data quality control rule is established manually.
For example, the information processing method of the present embodiment migrates data quality control rule information as follows. Tables 1 and 2 list the data models of two structured medical health data sources. The first medical health data source, corresponding to the data model in Table 1, is a relational database; the second medical health data source, corresponding to the data model in Table 2, is a JSON document. In the data model established for the first medical health data source, a data table is an entity of the data model, a field is an attribute, and the length, data type, constraint condition, value range, and value dictionary of the field are metadata items of the attribute. In the data model established for the second medical health data source, an object in the JSON document is an entity of the data model, and an object attribute is an attribute of the entity.
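The two mappings described above (table to entity and field to attribute; JSON object to entity and object attribute to attribute) can be sketched as follows. The layout of the relational schema description, the JSON document structure, the field names, and the helper names `model_from_tables` and `model_from_json` are assumptions made for illustration, since Tables 1 and 2 are only available as images.

```python
import json


def model_from_tables(tables: dict[str, dict[str, dict]]) -> dict[str, list[str]]:
    """Relational source: each table is an entity, each field an attribute."""
    return {table: list(fields) for table, fields in tables.items()}


def model_from_json(document: str) -> dict[str, list[str]]:
    """JSON source: each top-level object is an entity, its keys are attributes."""
    data = json.loads(document)
    model: dict[str, list[str]] = {}
    for entity, value in data.items():
        # A list of objects is treated as repeated instances of one entity.
        records = value if isinstance(value, list) else [value]
        attributes: list[str] = []
        for record in records:
            if isinstance(record, dict):
                for key in record:
                    if key not in attributes:
                        attributes.append(key)
        model[entity] = attributes
    return model


# Hypothetical schema description; field metadata items are kept per attribute.
tables = {
    "patient_basic_information": {
        "sex": {"data_type": "CHAR", "length": 1},
        "age": {"data_type": "INTEGER", "value_range": (0, 200)},
    }
}
document = json.dumps({
    "patient_basic_information": [
        {"number": "001", "sex": "1", "age": 35},
        {"number": "002", "identity_card_number": "110101200101019311"},
    ]
})
relational_model = model_from_tables(tables)
json_model = model_from_json(document)
```

Reducing both sources to the same entity and attribute shape is what lets the later retrieval steps compare a relational database with a JSON document.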
TABLE 1
Figure BDA0002408330010000131
Figure BDA0002408330010000141
TABLE 2
Figure BDA0002408330010000142
As shown in Table 1, the attribute information "sex" of the entity information "patient basic information" in the first medical health data source has a first data quality control rule: "the value needs to meet the requirements of GB/T 2261.1-2003: 0 is unknown sex, 1 is male, 2 is female, 9 is unspecified sex". The attribute information "age" has a second data quality control rule: "the value is an integer greater than or equal to 0 and less than or equal to 200".
A data quality control rule set is established for the second medical health data source. The entity information "patient basic information" in the second medical health data source is selected and retrieved among the entity information of the first medical health data source, and the matching entity information "patient basic information" in the first medical health data source is found. The attribute information of the entity information "patient basic information" in the second medical health data source is then retrieved among the attribute information of the entity information "patient basic information" in the first medical health data source, and the matching attribute information found is listed in Table 3.
TABLE 3
First medical health data source    Second medical health data source
Number                              Number
Name                                Name
Sex                                 Sex
Age                                 Age
(none)                              Identity card number
The attribute information in the first medical health data source that already has data quality control rules is "sex" and "age". For the attribute information of the entity information "patient basic information" in the second medical health data source, the data quality control rules of the matching attribute information in the first medical health data source are applied, and the conformity proportion is calculated.
For the "sex" attribute information of the entity information "patient basic information" in the second medical health data source, applying the first data quality control rule gives a conformity proportion of 100%. For the "age" attribute information of the same entity information, applying the second data quality control rule gives a conformity proportion of 100%. The corresponding threshold is set to 95%, and both conformity proportions are greater than it.
Thus, the first data quality control rule is added to the data quality control rule set of the "sex" attribute information of the entity information "patient basic information" in the second medical health data source, and the second data quality control rule is added to the data quality control rule set of the "age" attribute information of the same entity information.
As shown in Table 4, for the "number" attribute information and the "name" attribute information of the entity information "patient basic information" in the second medical health data source, the matching attribute information in the first medical health data source has no data quality control rule. For the "identity card number" attribute information of the entity information "patient basic information" in the second medical health data source, the first medical health data source has no matching attribute information.
Therefore, in the present embodiment, a data quality control rule is manually established for the "identity card number" attribute information, for example, "the identity card number needs to meet the requirements of GB 11643-.
TABLE 4
Number  Name       Sex  Age  Identity card number
001     Zhang San  1    35   11010120010101203x
002     Li Si      2    46   110101200101019311
003     Wang Wu    0    24   110101200101016110
004     Zhao Liu   9    57   11010120010101975x
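The 100% conformity proportions reported for "sex" and "age" can be reproduced from the sex and age columns of Table 4 with a short calculation. The sketch assumes that the sex codes are stored as strings and the ages as integers, and the helper names `sex_rule`, `age_rule`, and `proportion` are hypothetical; the rules themselves and the 95% threshold are taken from the example.

```python
SEX_CODES = {"0", "1", "2", "9"}  # codes listed in the first data quality control rule


def sex_rule(value) -> bool:
    return value in SEX_CODES


def age_rule(value) -> bool:
    # Second rule: an integer from 0 to 200 inclusive.
    return isinstance(value, int) and 0 <= value <= 200


def proportion(values, rule) -> float:
    """Fraction of values that conform to rule."""
    return sum(1 for v in values if rule(v)) / len(values)


# Sex and age columns of the four rows in Table 4.
sex_values = ["1", "2", "0", "9"]
age_values = [35, 46, 24, 57]

threshold = 0.95
sex_proportion = proportion(sex_values, sex_rule)
age_proportion = proportion(age_values, age_rule)
migrate_sex_rule = sex_proportion > threshold
migrate_age_rule = age_proportion > threshold
```

Both proportions exceed the threshold, so both rules would be added to the rule sets of the second medical health data source, as stated in the example.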
In summary, the embodiments of the invention have the following beneficial effects: the data quality control information in one medical health database can be migrated accurately and efficiently to another medical health database, which improves the utilization and accuracy of medical health data and avoids the large amount of time and cost required to develop data quality control manually from scratch for every system.
It should be noted that in the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, etc. does not indicate any ordering. These words may be interpreted as names.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
The above is only a preferred embodiment of the present invention, and is not intended to limit the present invention, and various modifications and changes will occur to those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. An information processing method adapted to migrate data quality control rule information in medical health data between at least two medical health data sources, the information processing method comprising the steps of:
acquiring entity information existing in a data model of any medical health data source and attribute information contained in the entity information;
searching in a data model of at least one other medical health data source relative to the any one medical health data source aiming at the entity information and the attribute information to obtain target attribute information matched with the attribute information;
applying the data quality control rule information of the target attribute information to the attribute information, and calculating the proportion of the attribute information which accords with the data quality control rule information to obtain a proportion calculation result;
comparing the proportional calculation result with a proportional calculation result threshold value;
and adding the data quality control rule information to the data quality control rule information set of the attribute information according to the comparison result.
2. The information processing method according to claim 1, wherein before the step of acquiring entity information existing in a data model of any one of the medical health data sources and attribute information contained in the entity information is performed, the information processing method further comprises the steps of:
aiming at each medical health data source, establishing the data model corresponding to each medical health data source;
wherein the data model includes at least one of the entity information, and any of the entity information includes at least one of the attribute information.
3. The information processing method according to claim 1, wherein before the step of acquiring entity information existing in a data model of any one of the medical health data sources and attribute information contained in the entity information is performed, the information processing method further comprises the steps of:
aiming at each medical health data source, establishing a data quality control rule information set corresponding to each medical health data source;
wherein the set of data quality control rule information includes at least one of the data quality control rule information.
4. The information processing method according to claim 1, wherein the step of retrieving, for the entity information and the attribute information, among data models of at least one other medical health data source with respect to the one medical health data source to obtain target attribute information that matches the attribute information includes:
aiming at the entity information, performing a first retrieval in the data model of the at least one other medical health data source to obtain a target entity information retrieval result;
and performing second retrieval according to the target entity information retrieval result to acquire the target attribute information.
5. The information processing method according to claim 4, wherein the step of performing a second retrieval according to the target entity information retrieval result to acquire the target attribute information includes:
judging that the target entity information retrieval result is that target entity information matching the entity information is acquired, and performing the second retrieval for the attribute information with the target entity information as the scope, to acquire the target attribute information; or
judging that the target entity information retrieval result is that target entity information matching the entity information is not acquired, and performing the second retrieval for the attribute information with all entity information in the data model of the at least one other medical health data source as the scope, to acquire the target attribute information.
6. The information processing method according to claim 4,
the first retrieval is a fuzzy matching retrieval; and/or
the second retrieval is a fuzzy matching retrieval.
7. The information processing method according to any one of claims 1 to 6, wherein the step of adding the data quality control rule information to the set of data quality control rule information of the attribute information according to the comparison result includes:
and judging that the comparison result is that the proportion calculation result is larger than the proportion calculation result threshold value, and adding the data quality control rule information into the data quality control rule information set of the attribute information.
8. The information processing method according to any one of claims 1 to 6,
the proportion calculation result threshold is determined according to the matching result of the entity information; and/or
the proportion calculation result threshold is determined according to the matching result of the attribute information.
9. An information processing apparatus characterized by comprising:
a memory storing a computer program;
a processor executing the computer program;
wherein the processor, when executing the computer program, implements the steps of the information processing method according to any one of claims 1 to 8.
10. A computer-readable storage medium, comprising:
the computer-readable storage medium stores a computer program that, when executed, implements the steps of the information processing method according to any one of claims 1 to 8.

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202010168641.3A CN111415749A (en) 2020-03-12 2020-03-12 Information processing method, information processing apparatus, and computer-readable storage medium
PCT/CN2020/096108 WO2021179461A1 (en) 2020-03-12 2020-06-15 Information processing method and device, and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010168641.3A CN111415749A (en) 2020-03-12 2020-03-12 Information processing method, information processing apparatus, and computer-readable storage medium

Publications (1)

Publication Number Publication Date
CN111415749A (en) 2020-07-14

Family

ID=71492855

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010168641.3A Pending CN111415749A (en) 2020-03-12 2020-03-12 Information processing method, information processing apparatus, and computer-readable storage medium

Country Status (2)

Country Link
CN (1) CN111415749A (en)
WO (1) WO2021179461A1 (en)


Also Published As

Publication number Publication date
WO2021179461A1 (en) 2021-09-16

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
AD01 Patent right deemed abandoned (effective date of abandoning: 20231103)